But what happens when they get bored with map data and get rid of it?
He had been ordered to turn over all of their historical aerial archives for scanning by Google, and then told the USGS would no longer do aerial scanning since Google was doing it. But there was no agreement for Google to turn over their aerial scans back to the USGS.
At the time we all told him not to worry, Google would never remove data it had collected. Looks like he was a lot smarter than us.
It's not just celebrities; so many independent artists are putting up their talent on Instagram, and I don't have access to any of it because I need an Instagram account for that. Instagram's web version forces you to sign up if you scroll one page down on a profile.
Sometimes I feel like we need to build cutting edge decentralized applications that will burn these walled gardens to the ground. /rant
Where's the non-proprietary decentralized platform that lets me reach as many people as I can on Facebook? There isn't one.
Why isn't the social functionality of identity / friends / followers / newsfeed / etc. built into browsers in a standardized way?
Facebook is 16 years old. That was a lot of time to figure out an alternative solution, but all we have are experimental projects that need adoption to be useful and don't have it.
Corporations aren't going to change how they behave, but it's annoying that us techies are apparently incapable of beating them at our own game.
Someone else mentioned that you can't reach as many people from your silo-ed website as you can if you go through social networks. I found one way you could get best of both worlds - through Medium's import feature[1]. But I don't yet know how effective that is.
Here's a short write-up in case anyone's interested: [2]
[1]: https://help.medium.com/hc/en-us/articles/214550207-Import-a...
[2]: https://ketanvijayvargiya.com/58-setup-blog-and-email-on-cus...
But... many of the same companies will fill your search results or fill affiliate pages with quackery ads just fine...
In my country, all physical books and magazines which are published must be submitted to the government in X copies. The government then keeps an archive.
With webpages, the problem of obtaining X copies never existed. Why couldn't the government have archived webpages like it always did with books?
However, if they create a monopoly on that data they have an obligation to preserve it, especially in the case of a corporation outright acquiring data instead of simply "out competing" for data. And as everyone mentions, of course they are in no way legally obligated to do so, but they are by any reasonable standard ethically obligated.
I do think that the government could and should archive data, but there is currently no system in place for doing so and likely will not be for a long time, if ever. Corporations would simply have to maintain the data that they already have.
They were sharing information with the whole world, but in an ephemeral medium.
The web, and internet, is not an inhospitable place for anyone without corporate backing. You can host a somewhat reliable service on a raspberry pi over your home internet connection.
Some of that is because search engines have simply stopped returning them in results even though they're still online.
Jeez, that's horrifying. Literally just giving public assets to private corporations.
If any entity with a plausible use case could and still can get that data at the cost of the copy, I don't see why not. The whole "copying does not deprive the original owner" meme applies particularly to such public assets.
Can you point me to where I can download this data for the cost of a copy? Didn't think so.
If the effort to USGS could be quantified in a cost, I'd expect Google to pay USGS to make the public data available?
It does sound awful. I don't know what the right answer is.
1. A corporation is not a person. Corporations don't have rights, except inasmuch as the people within the corporation have rights.
2. The problem isn't that Google has access to the data, it's that USGS and the rest of the world no longer have access to the data, except on Google's terms.
That was poor negotiation by USGS Solicitor's Office. Libraries participating in google digitization programs negotiated to keep copies of their scanned materials in the Hathi Trust Digital Library https://www.hathitrust.org
They are still figuring out a good way to archive those and are quite selective in what they archive, but they plan to expand on that.
German page: https://www.dnb.de/DE/Sammlungen/DigitaleSammlungen/dgitaleS... English page: https://www.dnb.de/EN/Sammlungen/DigitaleSammlungen/dgitaleS...
https://www.usgs.gov/core-science-systems/ngp/3dep/3dep-data...
Making digital data publicly available is pretty new for USGS. Just a few years ago archived aerial imagery had to be ordered by mail and it was a pretty lengthy process. Topo maps (the earlier equivalent of the DEM data to which you refer) were generally ordered on paper as well up to five or so years ago, but they're in a lot more popular use so more third parties got into the business of distributing them. I've relied moderately heavily on both for some of my research, and until just recently it was a very painful process to get anything older than current. In the meantime, yes, Google had it all at some point, but mostly stopped using it or providing it because they obtained better quality imagery.
Fortunately USGS now has a slippy map for topo and an admittedly rather clunky ESRI query service for aerials.
But we're just not smart enough to understand that, never mind make it happen.
Instead we prefer to cling to the bizarre delusion that billions of individuals with competing interests will somehow spontaneously self-organise into the best of all possible worlds.
However, to be fair, it has always been the most greedy and self-interested, with already the most disproportionate power to rig the game in their favor, that have been most vocally advocating this system. No surprise there, of course.
What fascinates me is how a majority of people, who certainly do not personally benefit from that system, have been made to believe that they do. Sure, political corruption, cultural indoctrination/propaganda, horrendous general education, and I can think of a few more .. but still I've always been amazed at how it appears to have canceled even basic logical reasoning among so many.
Who knows, maybe one day it will turn out to not just "correlate" with an addiction to carbs/sugars, with which the country has plenty of problems too. Junkies have always been easy to manipulate.
Until then, at least it still gives some hope that a growing number of people now realize that this system just doesn't work as it is advertised.
It's fine when you travel by car, but when I'm hiking through the hills I'm just walking through an empty square on Google Maps. Volunteer-driven OpenStreetMap is MUCH better. And there the data is actually open and safeguarded.
Governments should support that kind of project instead of corporate privacy-invading playtoys like Google Maps.
Anecdotally, a close relative (and many others in her institute) designed entire curricula of learning modules for a government-owned nationwide technical college, back when online learning was newish, ~20 years ago (I think back when SCORM was fresh). These were tightly integrated into the traditional in-class offerings. A couple of years later a "trim the fat" government slashed internal capabilities and outsourced all "IT" hosting, management, etc.
All of the online learning modules (which would have cost millions in man-hours to develop) were literally handed over as "content" to a company who to this day offers them back to her institute under per-student licenses (that far exceed any "hosting" costs of these basically static resources) over a decade later. This company also profits off licensing to an array of pop-up online "institutes" that don't even approach the pedagogical context needed to ensure quality education outcomes from these resources.
Like a comedy of errors, from time to time some lecturer at her college will want to ask some question about the materials, their boss directs them to the company support (which is a paid service), after the issue escalates through the support tiers and they realise they need the expert knowledge of the author she'll get an email with the question, a process that can take days or weeks when the lecturer could have walked into the office next door and asked her directly, if the company hadn't stripped all author credits from the materials.
If the company decides to shift business models, or goes out of business, or is acquired and scuttled, these assets get blown to the winds.
There's a lot I could say about this situation, but essentially governments in general seem to devalue their assets at taxpayer expense, the IP of these assets could have been better handled rather than just giving it directly to the first company to win the contract all those years ago.
A font of knowledge
I am a holdout.
(Not suggesting I am "smarter" than Go users, but I can foresee issues with Go being controlled by Google.)
It sounds awful that Google has the best mapping data in the US. In the UK Google's data is awful, worse than OpenStreetMap and much worse than Ordnance Survey, the national mapping agency.
Anything that is (no longer) of commercial value will be "phased out" and dismantled/destroyed. One might still stretch it a bit, by arguing that the commercial value of something can include its future potential value. But I personally know not a single commercial company that ever chose that over short term cost reductions and "profit optimizations".
Luckily, there are governments who acknowledge this shortcoming and build structures to compensate for it. But when governments decide to leave (almost) everything to commercial markets, then the importance of anything and everything can and will only be measured by its commercial (contemporary) value/profitability.
People have every right to vote for and support such a system. But then don't complain, when all that you will get is only what such system supports/provides.
Googlewashing - to proclaim “Google would never ...”
Looks like he was a lot smarter than us.
If you would've asked me back when Google was new, and we all believed in "Don't be Evil," I would never have thought that Big Tech would end up being the Ministry of Truth and The Memory Hole.
(This was at the same time that there was a gold rush of IPO plays, hiring anyone who could spell "HTML", and plopping them down in slick office space, Aerons for everyone, and lavish launch parties, with tons of oblivious posturing and self-congratulating. But Google stood out as looking technically smart, at least I believed the "Don't Be Evil", since that was the OG culture, and it seemed a savvy reference to behaviors in industry and awareness of the power that it was clear they would probably have.)
That might be why it wasn't surprising to hear of things like someone entrusting a bunch of old university backup tapes to Google's stewardship.
This has played out with mixed results, and I think Google could be doing much better for humanity and for techie culture.
If you look at the history, Google basically rescued the data from a collapsing Deja News, and made it available again. A nice gesture, which didn’t serve to benefit Google much in the long term.
If we want to preserve history then we can’t rely on for-profit companies. We need to instead fund non-profits whose specific charter is archival and preservation, like the Internet Archive.
Given the nature of Usenet, they were if anyone wanted them.
What Google is doing by refusing to publish the archive or even share it with parties like the Internet Archive is completely unjustifiable and anathema to everything they once stood for.
Couldn't a copyright claim (or something under the GDPR or UK's DPA) be used to regain access to those though?
Just because something is published to a public forum doesn't mean you relinquish your rights.
They cared enough about it to kill it.
Personally I'd like to be able to link to my own posts from that time, for when people asked me what I used to do. But I can't find them any more.
These groups are mostly not code. They are conversations, design discussions, ideological discussions, jokes, that sort of thing.
Like what we have now in social media, except back then there was pretty much only Usenet, and it had a very different feel than the current social networks.
They are where ideas like the smiley, and free and open source software, and utopian ideas of internet culture were developed. All the early internet memes. And of course all the knowledge people shared.
Conducted in public at the time and thought to be archived for the long term.
You're right though that a decision will probably have to be made at some point about what to keep and what to toss (how big is YouTube, exactly? Are we really going to keep every video, in its original resolution, forever?), but this is just plaintext, it takes up almost no space. The decision doesn't even have to be made, since it's easy to find the means to store this, so why bother making it? Kicking the can down the road is actually the best decision in this case, since the people of the future will (hopefully) have a clearer understanding about what was important in our own past than we do currently.
It's because, at the time, you don't know what information is going to be important and what is just garbage. Documents that are apparently useless today could become fascinating tomorrow.
Interestingly, when some people saved a great deal of the Usenet archives pre-Deja News, one of them said something to the effect of they wished they had prioritized saving social discussions and so forth because, by and large, saving discussions about a bug in a long ago version of SunOS probably wasn't very interesting.
Honestly even that sounds pretty fascinating:
It could help someone gather stats on the nature, frequency, and severity of bugs over time and across companies from another angle.
It could provide a fresh perspective on modern OSes by showing how historic OSes did things.
And it might be good material for a course on the history of software engineering practices, showing classes of bugs that have been eliminated, and styles of development and customer support that worked or didn't work.
Those archives are full of useful and informative information.
Not everything changes fast. Common Lisp has been around for 30 years basically unchanged. The discussions back there can be truly informative for today.
It does take time to wade through it, but people have been collecting (via the Google archive, when it existed, sigh) curated lists.
https://www.xach.com/naggum/articles/ https://www.xach.com/rpw3/articles/
There are still interesting things to be learned from ancient artifacts.
And if not, what makes comp.lang more like the pyramids than geocities?
Never forget that we do not know the future.
That or risk future archaeologists thinking COBOL was some God of the time and the natives built large metal obelisks in dedicated worship temples.
likewise many people are clinging to the local operating system rather than moving to the SAAS model.
so what happens if we lose the oldschool languages and platforms entirely, for whatever reason ?
if TBTF corporations are somehow hobbled or neutralized, we need old hand tools to build a tech newtopia from the rubble. if those tools are destroyed then we are beholden to a system that stands on very thin ice.
I second that. The need to rebuild from the rubble is often overlooked, especially by corporations driven by profit-centered goals.
1) Eventually, everything will be lost anyway. The original print of King Kong is gone. A fire at Universal Studios wiped out the masters for a lot of music at once https://en.wikipedia.org/wiki/2008_Universal_fire . Floods destroy family photos all the time. But those are examples of the forces of decay, of natural entropy, of error. The Library of Alexandria probably contained a lot of useless crap but also nuggets we’d want to know today. Information is memories, useful information is useful memories, and there’s no compelling REASON to lose it. Other sections of usenet history were wiped out when Google acquired it (a lot of comp.database.olap content I had a hand in) and groups of people just lost a knowledge base.
2) It’s not simply code that no one uses anymore. It’s a knowledge base on how and why, debates over constructs and usage that are useful beyond code-sharing snippets a la Stack Overflow.
3) There is an argument for letting some information get lost or at least super-obscure, but it’s hard to see this being a good example. Tide Pod Challenge videos come to mind. GDPR and right to be forgotten mandate something akin to information loss.
4) I posted this elsewhere but I’ll share here too: there was a comment made on the original article about preserving prior art for IP (patent) purposes. That alone is in the public interest. Irrelevant to your questions in general, but pertinent to each of them in this case.
To fix problems caused by Google, you need to change the principles of competition law. Microsoft was knowingly doing lots of stuff that violated laws. It was just very hard to prove it.
I mean, it was all in the news, trade magazines, business journals. Blackmailing OEMs, intentionally breaking things and making them incompatible. At least the legal battles are documented somewhere and Wikipedia has something about them, but they were just the tip of the iceberg.
https://en.wikipedia.org/wiki/Microsoft_litigation
https://en.wikipedia.org/wiki/United_States_v._Microsoft_Cor....
https://en.wikipedia.org/wiki/Browser_wars
There must be a book somewhere.
Dan Gillmor's articles in the San Jose Mercury News from the '90s should be somewhere.
Basically, small software startups had to have a Microsoft strategy. They had to find a way to stay off Microsoft's radar or MS would steal their work, hire away their developers, or block them. You sue them like Stac did and MS just stalls for a few years and pays a few million in damages. It was worth losing in court to protect the monopoly.
Big OEMs like Dell had to do what MS said or MS would up their price. It was straight blackmail from a monopoly position.
They sucked the air out of advertising (in cooperation with Facebook) leaving none for others. But I consider that a small loss.
Microsoft did that for operating systems, productivity software, stalled the web with IE6, and more.
Google is capable of much more damage, for sure. But they haven’t done that damage just yet.
That is changing extremely fast.
The easiest example is RSS: Google entered the RSS reader market for free and at a loss, and effectively killed the competition because you cannot compete with that. Then it subsequently killed Google Reader. This chain of moves essentially drove RSS to being obsolete, which in turn made everyone far more reliant on Google and social media.
Now extend this to other products that they’ve started for free and subsequently killed. It’s not the same as embrace, extend, extinguish, but the result is the same. You kill off competition and stunt progress.
It’s mostly that RSS isn’t monetizable as easily as web pages. I think FB and Twitter dropping their feeds had a more significant effect; regardless, RSS was always niche.
I beg to differ; Gmail has not changed much since I signed up 16-ish years ago.
They are all the same, as soon as competition goes away, this happens.
If Gmail required emails themselves to be in a special format that broke other MUAs, and wouldn't render standards-compliant emails in a way you could read, that would be analogous to what IE6 was up to.
By inventing XMLHttpRequest?
Android, GSuite, Chrome
So I do think they have an obligation either a) to make the whole archive available for anyone or b) maintain it properly.
Properly means restoring the fast UI from around 2004.
It's probably not a good idea to depend on a public company to steward an important community.
Does the Internet Archive have copies of all the old stuff at least?
Which is sad, but expected.
That'd be an improvement.
Page & Brin retain controlling interest, despite their minority stake.
Dejanews was the seed material for Google Groups, any profit derived from that (ads) was from content posted to Usenet by people who never intended for it to be used for that.
I - as most of us - have a personal google account, and our company uses a google business account. While I'm following news regarding google cancelling accounts at will, I fail to notice a reliable pattern: (alleged) fraud and other illegal stuff seems to comprise a good part of it, but at most 30-50%.
I treat all Google accounts as throwaways now and don't use the work email at all because I want to know that I can actually receive emails that are sent to me. That's a huge problem even without randomly losing access, because their spam filter has a ton of false positives and those emails don't get forwarded to my real address.
Play Music has not been shut down (yet), and you can transfer everything to YouTube Music, which is available at the same price (and in my opinion a superior product).
Spotify is generally better than Play Music though, so it was for the best in the end.
a) Spy on people and sell the data to advertisers.
b) Use that data to directly push ads
That's basically incompatible with b2b services. Or consumer services. As a customer you're judged by how valuable the data they are collecting on you is. Which is less than a support call costs. That bleeds into every facet of their business. As such even if you pay them money you get the same treatment because they can't think any different.
I have no idea how useful the collection may prove to be. I found 'comp' but it doesn't offer a webpage view, just a link to download a file. https://archive.org/details/usenet-comp
I think you have to register. Not sure how much history is there.
Blocking posting access to these newsgroups from GG is generally a good thing for those newsgroups.
Not being able to search the archive is the unfortunate collateral damage though. Google is not obliged to provide a Usenet archive, I suppose.
Formerly obtained deep links to the content also do not work!
If you formerly cited a comp.lang.lisp article by giving a direct link into Google Groups, people navigating to it now get a permission error.
NNTP is a wonderful protocol, arguably the simplest of the 4 mailnews protocols (IMAP, POP, SMTP, and NNTP). While it seems to share the same basic format as RFC822 messages, it actually tends to avoid some of the more inane issues with the RFC822 formatting (generally prohibiting comments and whitespace folding).
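To illustrate the point about sharing the RFC822-style format: a minimal sketch, assuming Python and only its standard-library `email` module, of reading a Usenet-style article as headers plus body. The article text, the address, and the `parse_article` helper are all made up for illustration; this is not how any particular news server or archive stores things.

```python
from email import message_from_string

# Hypothetical article, invented for illustration only.
RAW_ARTICLE = (
    "From: someone@example.invalid\n"
    "Newsgroups: comp.lang.lisp\n"
    "Subject: Re: macros\n"
    "Message-ID: <abc123@example.invalid>\n"
    "\n"
    "Body text goes here.\n"
)


def parse_article(raw):
    """Split an article into a few headers and its body (hypothetical helper)."""
    msg = message_from_string(raw)
    # Newsgroups is a comma-separated list when an article is crossposted.
    groups = [g.strip() for g in msg.get("Newsgroups", "").split(",") if g.strip()]
    return {
        "newsgroups": groups,
        "subject": msg.get("Subject", ""),
        "message_id": msg.get("Message-ID", ""),
        "body": msg.get_payload(),
    }


article = parse_article(RAW_ARTICLE)
```

The same general-purpose mail parser handles it because the header block and blank-line separator look the same; real articles will have more headers and encodings than this toy one.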
Unfortunately, the internet by the early 2000s started turning more and more into an HTTP(S)-only zone. Usenet itself hemorrhaged its population base, especially as ISPs shut down their instances (e.g., because someone found one child porn instance somewhere in alt.binaries.*).
(Though mere long article retention is not necessarily the best archive interface, of course.)
Disclaimer: I'm not well-versed in the solutions in this space. Maybe there is some NNTP cacher out there that also has a web archive interface into it or whatever.
https://github.com/DigitalMars/ngArchiver
and the generated pages:
https://digitalmars.com/d/archives/digitalmars/D/index.html
When we were working on the history of the D programming language paper, this was an invaluable resource.
There is also https://www.eternal-september.org/ which I used.
AOIE requires no authentication. The Eternal September server requires account registration via the web site; then you use an authenticated NNTP connection.
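For anyone curious what "authenticated NNTP connection" means on the wire, here is a rough sketch in Python, assuming the username/password exchange described in RFC 4643 (`AUTHINFO USER` followed by `AUTHINFO PASS`). It only builds the command lines and labels a few reply codes; it does not open a socket. The function names and the example credentials are hypothetical, and which mechanisms a given server actually accepts is something to check in that server's own documentation.

```python
def authinfo_commands(username, password):
    """Return the two AUTHINFO command lines, CRLF-terminated, as bytes."""
    return [
        f"AUTHINFO USER {username}\r\n".encode("utf-8"),
        f"AUTHINFO PASS {password}\r\n".encode("utf-8"),
    ]


def classify_reply(line):
    """Map the three-digit reply code at the start of a line to a label."""
    meanings = {
        "281": "authenticated",
        "381": "password required",
        "480": "authentication required",
        "481": "authentication failed",
    }
    return meanings.get(line[:3], "other")


# Placeholder credentials, not real ones.
commands = authinfo_commands("alice", "s3cret")
```

A client would send the first line, expect a 381 reply, send the second, and expect 281. Since the password travels as plain text in this exchange, it should only be done over a TLS-protected connection.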
There are other servers out there.
These sites do not provide any archive.
1) Hiring standards have drifted downwards over the past 15 years. Google used to be super-elite, compact, do-no-evil, massive-profit-per-employee. It's now a 140,000 person organization, and at that scale, standards just aren't high. You have a team of dozens of incompetent people doing what one person used to do.
2) With COVID19, ad revenues have crashed. It's not clear the impact on Google.
3) The smart, ethical folks on top (folks like Larry, Sergey, and Eric) are gone, and replaced with professional managers. They were smart to pick an internal CEO, but most of their executive team comes from places like Microsoft, Oracle, or Morgan. Having known a number of professional executives, the key skill is climbing executive ladders and moving into positions of power, not running successful companies.
4) Their products are increasingly starting to crash-and-burn, especially in B2B. Their culture relies on automated systems over people, and their automated systems have taken down tons of mission-critical businesses. Automated works well at 1000 people supporting 7 billion in B2C (small elite team model), and not so well for a massive, 100k person company.
5) I've switched mostly to non-Google products because they're better for what I need. AOL was massive too at one point. Losing the tech edge is not good. I still use gmail.
On the other hand, their revenues have continued to rise exponentially since they started. So perhaps they're doing fine?
In Google’s case you can see this boredom on Android, and in the number of products announced and casually killed (they might be excellent standalone products but can’t move the needle on earnings for the benefit of Wall Street, so why bother?).
Contrast this to early Intel in the Grove era: they were on top of the world with the memory business so they pivoted to something else. Google has had the same two products for almost 20 years. The later Intel has been more like that.
Another contrast: they don’t know what to do about the advertising downturn, so are cutting back on hiring and such, while FB is trying to double down.
Also, and this may be a bit of a tangential point, but the "deny the past because it has something bad" that Google has effectively done here is uncomfortably close to the set of recent and far more political events.
You just reminded me of a quote from an electronic music documentary 25 years ago. One of the Detroit techno artists insisted on taking the filmmakers to a historic theatre that had been left to crumble & turned into a car park:
"In America especially, nobody tends to care about these kinds of things. People in America tend to let this shit just die, let it go. No respect for the history. I, being a techno, electronic, high-tech futurist musician, I totally believe in the future! But as well, I believe in a historic and well kept past. I believe there are some things that are important. Now, maybe this is more important like this, because in this atmosphere, you can realize how much people don't care, how much they don't respect. And it can make you realize how much you should respect."
- Derrick May, DJ/Composer, Universal Techno (1996)
https://youtube.com/watch?v=tdox6H7FJBU&t=955s
The segment starts at 16:00 in the video and is about 2 minutes long.
You may be surprised that it's not just companies. It's not hard to find people who think it's better for old stuff to just be deleted.
https://groups.google.com/forum/#!forum/comp.lang.forth
> Banned Content Warning
> The group that you are attempting to view (comp.lang.forth) has been identified as containing spam, malware or other malicious content. Content in this group is now limited to view-only mode for those with access.
> Group owners can request an appeal after they have taken steps to clean up potentially offensive content in the forum. For more information about content policies on Google Groups, please see our Help Centre article on abuse and our Terms of Service.
There's no content available for me.
Aside from the spam, it gradually switched from passionate but respectful debates to name calling and plain insults from newbies to what remained of the veterans.
One could read very long arguments between Elizabeth D. Rather, at that time CEO of Forth, Inc., which she founded with C. Moore sometime in the '70s, and Jeff Fox (RIP), who was working with Moore at that time. Moore left his first company to pursue his adventures in hardware, making different "Forth processors", which eventually led to the RTX2000 which powered, notably, the Rosetta probe.
[ SELF FOOT SHOOT ] 1000 REP
Brb archiving my Twitter posts
Well, you assume. Maybe it was just decentralized enough you haven't heard about it.
Looks like there have been mechanical legal complaints (likely automated, since nearly all of them are the same Italian phrase), and they probably caused this instance of automated blocking going wild.
As an engineer I can understand the desire to automate everything, but please at least have some heuristics to detect this kind of easy-to-detect mechanical behavior before giving the model full authority to block anyone it doesn't like.
A Genoese lawyer has been a victim of harassment and heavy doxxing for some time. You can find many Twitter accounts accusing him of paedophilia in cahoots with Epstein, Berlusconi, the Pope and so on (no, I'm not kidding; the stalker clearly has mental health problems).
The stalker is very prolific and is wallpapering the internet with his copy-pasted accusations in every corner, from newspaper comments to ancient forums to Usenet. The lawyer reports them and asks for removal where he can, but he does not seem very worried, since this has apparently been going on for two years ...
I don't think I can say the names of the people in question, but in any case I'm archiving the harassment accounts before proceeding with the report, then I'll try to get in touch with the lawyer and see if he can request a new, less "coarse" censorship.
I had a UUCP news feed from a local internet provider when I was in high school, back in 1993 or so.
Was I naive in thinking that The Internet Archive would have long archived this type of thing?
If you want to look, you might start here: https://archive.org/details/usenet
These groups should be putting more effort into federation and decentralisation. Make it possible to store all of this data in a distributed fashion and stop relying on a central authority for archiving purposes.
The problem is that there are no other searchable archives.
Like when Google decided it's going to host comp.lang.c, can there be only one comp.lang.c on the internet, or can someone else start hosting comp.lang.c as well?
Not safe for work!!!
In fact Usenet predates spam itself, since the first spam (Canter & Siegel) was on Usenet itself in 1994 (I was there).
Thanks for posting this, it reminded me to donate again to archive.org, which I just did.
I use ‘culture’ to include anything creative, anything that we experience as humans. Everything should be preserved, schools should be well funded, as should the arts.
> In 2009, Ron Garret published a 700MB archive file of all of comp.lang.lisp
Of course the downside of Usenet was most people expected conversations to disappear after a couple weeks or a month but there was always some jerk that kept everything and refused to delete anything.
This is really bad marketing
The support ticket was deleted, so I guess not.
Once you see things in this light, the new flavor of the month online service just doesn't hold any allure.
> Has anyone (EFF?) considered the aspect of destroying evidence of prior art in the public domain?
I think there’s a case to be made for stewardship of these groups for that reason.
I guess there was this unjustified assumption that google only adds & never subtracts.
Time to de-Google the whole Web.