Instead, it's just 'what about old-fashioned websites, plus lots of XML schemas and long spec documents'? It just tastes like a rehash of Berners-Lee's existing '5-star open data' spiel ( https://5stardata.info/en/ ), but now with the billing that it'll fix the internet. 5-star open data has been around for years now, and, well, the linked data future isn't here. When's the last time you consumed RDF in an application?
Ultimately I think there are technical solutions to making the decentralized web more attractive than the walled gardens, but at this point they will need to be ridiculously polished and shiny to even get a look, and this stuff... is not. Going forward it gets even worse: they're going to be opposed at every step by corporations with more money than most nations.
The internet was originally decentralized because the government wanted to make it that way, and I think the only way to get back there is going to require a gigantic, economically unattractive investment. There are at least a few governments that may have the capability but I can't name one that would have the motivation. Hopefully some billionaire's charity will decide saving the internet is a worthy legacy.
The internet doesn't really tolerate serious technical barriers stopping someone from automatically multiplexing the content from various social networks into a single read-write stream, for example. The issue is that when someone attempts to do that kind of thing, they get sued and they end up owing BigTechCo millions of dollars. [0]
An open internet is _not_ a technical issue. It's a legal one.
E-mail is pretty much the last bastion of the old open Internet, and the amount of resources needed to just deal with malicious e-mails is huge. Mindbogglingly huge. And those costs cut out a lot of organizations from being able to operate their own e-mail servers (either the costs of doing it or the costs of verifying to the big players that the e-mail you're sending isn't garbage).
And that's pretty much the story across the board. The old Internet was overwhelmed by bad actors who would ruin everything. Facebook and Twitter house a lot of awful stuff. But can you imagine how bad it would be if we were all still using USENET and IRC?
Is this still true when “telling who's a ‘robot’” is such a common thing to have happen? For instance, I've heard of at least one major platform both sending back quite a lot of UI telemetry and considering third-party clients a violation of their ToS; I haven't heard of strong action being taken yet. (I'm avoiding naming them both because I'm operating partly on hearsay and because I'm more interested in the general question.) Hasn't bigtech had a lot of time and motivation to advance “how to detect people who are using some weird software to talk to us”?
Simpler forms of technical barriers, like with the AIM protocol, were defeated in the past, but it seems like massively upgraded data backchannels, machine learning algorithms, and the new normality of silent automatic updates all the time might strongly favor a centralized defender. Plus IIRC the CableCARD wars didn't go so great, and there were presumably a lot of people motivated to save money on expensive TV packages, whereas risking losing access to all your friends for having slightly better control over something that's notionally “free” anyway sounds like a harder sell.
I don't think it's easy to defeat “socially required tech” + “automatic updates” + “machine learning” at all.
If the legal barrier went away, and someone surmounted the "unserious" (which I doubt) technical barriers to doing that, then the content providers would go out of business. That's arguably a good thing, but I suspect many people would disagree. The problem is the profit motive and financing model for what consumers want, not the degree of decentralization. Google didn't screw up the internet; people did, by preferring what Google offers.
The internet as used by many, many people consists of a few centralized walled gardens. Walled gardens also exist because of network effects.
> An open internet is _not_ a technical issue. It's a legal one.
Perhaps it's a social one as well.
There are so many abuse related issues on the web and I’ve seen no decentralized effort that works unfortunately. Cloudflare brought cost effective DDoS protection to the masses.
But that’s beside the point that I originally wanted to make: spam and email have very close parallels to security and the Internet. In another universe, it’s possible that the issues of spam and security could’ve been incorporated into the protocols themselves. But for some reason I’m sure is rational, those issues were moved outside, to the hosts, to be solved with middleboxes such as firewalls and Google’s spam filter.
I would argue that this was the smarter choice and that both of these involve the same problem: spam and security are ill-defined and constantly evolving.
Anybody who wants to advance the open web should focus their efforts on a P2P library with extremely good NAT traversal capabilities that is extremely reliable, simple to use, and supports as many programming languages as possible (certainly not just C or C++). It needs to be deployable under a permissive license on all major platforms (macOS, Windows, Unix, Linux, iOS, Android, and browsers), and may not transport any data or chew up bandwidth without giving the programmer and end user total control over it. It needs to have a dead simple, almost idiot-proof API. The resulting network on top of IP needs to be searchable, not too high-latency, and able to route to any endpoint on it.
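For concreteness, here's roughly what that "dead simple" API surface might look like. This is a hypothetical sketch in Python; every name is invented and the actual networking (key generation, NAT traversal, routing) is stubbed out:

```python
# Hypothetical sketch of the "idiot-proof" P2P API described above.
# No such library exists; all names and behavior are illustrative only.

class Peer:
    """A node on the overlay network, identified by a keypair."""

    def __init__(self, keypair=None):
        # A real library would generate or load a keypair and start
        # NAT traversal in the background here.
        self.keypair = keypair or "generated-keypair"
        self._inbox = []

    def send(self, peer_id, payload: bytes):
        # Would route the payload through the overlay network;
        # here we only record the intent.
        return {"to": peer_id, "bytes": len(payload), "status": "queued"}

    def receive(self):
        # Would block until a message arrives from any peer.
        return self._inbox.pop(0) if self._inbox else None


peer = Peer()
result = peer.send("peer-abc123", b"hello")
print(result["status"])  # → queued
```

The point of the sketch is the shape of the API: two or three calls, no sockets, no hole-punching visible to the application programmer.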
That's still the biggest hurdle for the Open Web. Everything else is secondary.
The confusion in your comment is that one would need RDF to do Linked Data. I've written about that misconception here: https://ruben.verborgh.org/blog/2018/12/28/designing-a-linke...
Don't get me wrong, the Semantic Web community has made mistakes and has not been developer-friendly. But we're not still stuck in the 90s. For instance, XML hasn't been a part of any of this for many years.
AKA, instead of "Eval is Evil" we might say "XML is Evil".
With open-science mandates coming from governments around the world, researchers are looking for ways to share their data in meaningful ways. I can think of a significant amount of research that regularly consumes RDF, particularly in the fields of medical biology and genomics, where it's used to annotate data. This is where I'd guess you'll see it take a foothold; for example, medical diagnosis codes are notoriously disparate and there is a strong appreciation for what semantics could address. Unify, exchange, and consume medical diagnoses ... profit.
Links etc. off the top of my head:

* GO - The Gene Ontology, used in hundreds of thousands of genomic annotations: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3944782/
* UBERON - https://genomebiology.biomedcentral.com/articles/10.1186/gb-...
* The second year of US2TS - http://us2ts.org/2019/posts/registration.html
* OBO Foundry - https://github.com/OBOFoundry/OBOFoundry.github.io
Where is a good place to participate in those debates, especially data authenticity and local server pods?
FreedomBox didn't go anywhere. FreeNAS with ZFS is reliable but not designed to be exposed to the public internet. Many local services are using a centralized rendezvous server for NAT hole punching.
On the shiny commercial front, MyAmberLife has $13M in funding for a home server but it's mostly controlled by a central cloud service. Do Western Digital, Synology, QNAP, Drobo, etc care about decentralization?
The current solution to several issues with electronic health records was to create a new standard using RDF and linked data, which solved most of the issues with the previous standard. See FHIR: https://en.wikipedia.org/wiki/Fast_Healthcare_Interoperabili...
In fact, current linked data discussions seem to me to have become relevant again because it is clearer now that we have been misusing and overusing REST, microservice architectures, and GraphQL for problems that were already analyzed and solved.
But, of course, for a single application that doesn't require interoperability, standardized data exchange formats, or support for flexible data representation, Linked Data and RDF will clearly be unnecessary. But over time, the future of data interconnection plays to the strengths of Linked Data, IMHO.
Until now, attempts to create alternatives to Linked Data + RDF infrastructure have been more likely to produce ad-hoc, informally specified, bug-ridden, slow implementations of Linked Data and RDF.
And just to make sure we are on the same page here: it's not academics' job to build usable products. We will continue working on things that are novel from the academic standpoint; if people like you dismiss LD/SemWeb, those novel things will have "a terrible track record of solving real-world problems". I hope this does not come across as too personal.
And using the semantic web for that is just as bad. A basic JSON API would be much more stable than parsing a document with navigation and the like just to get that data.
Highly unlikely someone will bother with this (in addition to all the other quirks) while making their website.
For example, I can easily fire up a Tor onion service on my never-turns-off home desktop computer and reach my stuff from anywhere. Why can't I reach my friends' stuff the same way? Because, to use business-speak, there's nothing "turnkey". It's something I've been pondering and working on. Sure, the bigger players may have to be in DCs, have more stringent uptime requirements, and distribute their bandwidth/workload more. But for most of us, desktop software and web-of-trust style connections could go a long way so long as the front of the software has a FB feel (e.g. a feed, messages, etc). We can tackle discovery, searching, aggregation, offloading, etc later.
So Microsoft moved Skype to a centralized service and has been trying to monetize it since.
The problem with decentralized servers isn't technical; something half as fast as your phone could easily handle distributed versions of popular websites. The ARM-based "wall warts" were plenty fast, and they are several generations old already.
How could decentralized applications/services be sustainably funded? If not advertising, how? If it is advertising, what's the benefit to users?
Most importantly, why would users care about decentralized vs centralized?
It does seem that a modest ARM-based server that's silent, potentially integrated into a wifi router, would hugely reduce the downsides of p2p networks: free power, cheap bandwidth, and, being part of a p2p network, it would avoid long startup times for applications. Users on their phones would get instant access to their data while their local node did any proof of work, DHT tracking, and earning the reputation necessary to use bandwidth, CPU, and storage from other peers.
I don't see any technical barriers, just that users wouldn't care, and nobody would want to pay for it.
Probably few, except those actively searching for it. And, especially with decentralisation, the inevitable outcome of being too popular is that it starts to become centralised again, to make things easier.
Besides that, I thought for a moment about Apple's tech. If you and a friend have an iPhone and you're both trying to connect to the same network (and have each other in your contacts, I think), iOS will allow you to automatically share the credentials and connect the other phone too.
I reckon that you'd see a lot of value in that kind of device integration, which is essentially peer-to-peer.
If everyone owned a personal cloud box, security procedures would quickly fall off a cliff; those people are now the new botnet.
Maybe leave people alone, stop sticking our fingers in all the places they might stick to money. Let the geeks take care of their little tribe.
Though it's not really simple enough for non-technical people to set up.
In the depths of my soul I would love to re-decentralize the web. I truly believe data centralization will cause people to suffer a lot. Decentralized tech needs to solve so many problems before alternatives to centralization become viable. Centralized approaches also improve over time and are a moving target to keep up with.
How any other open source is funded (e.g. corporate/individual donations, grants, crowdsourcing, support, ancillary products, etc). I don't believe, at least for an MVP, that much funding is needed compared to the scope of some of the successfully funded open source projects that exist.
> Most importantly, why would users care about decentralized vs centralized?
They wouldn't. And ideally, beyond the annoying hoops on initial versions (e.g. discovery/identity), they shouldn't. Your software needs to win on features. A self-hosted, subscribable Reddit clone w/ chat would be a very good start.
The only one I can think of right now would be Font Awesome 5 (via Kickstarter), which already was a very popular product with big name recognition and had a professionally run Kickstarter.
Pretty much every open source project only survives because developers donate their own time, or companies allow their developers to do so.
Routers have a few key advantages over most other computing devices owned by the public: routers normally have a public (non-NAT) IP address, they're always on, and battery power is not a concern. If people could install a Tor implementation on their router with just a few taps, Tor usage could expand dramatically. Developers of decentralized social networks might finally get a foothold once the installation problems are gone.
IMHO the best way to re-decentralize the Internet is by creating routers that host arbitrary apps, along with a marketplace of router apps.
The way to bootstrap this idea seems simple: sell a new router with a thin margin, provide SDKs for free, let developers charge what they want for the apps, and take a 30% cut from app purchases. The plan seems so clear that I wonder why I haven't yet heard of anyone doing it. :-)
> routers that host arbitrary apps
Servers. You are describing servers.
Edit: upon reflection, rather a lot of people I know don't have routers, either; they use shared internet (e.g. xfinity) or only have mobile plans.
Your ISP will still know what you are doing, and will have the ability to block you from doing it should the need arise.
I also feel like people are proposing solutions based on technologies as they stand today. Chosen solutions to the decentralization issue need to consider realistic future cases. Just as a quick thought experiment, what happens in the near future, say, 3 to 5 years out, when a huge chunk of people are using 5g technologies as their primary connections? AT&T, or sprint, or verizon, or what have you will still get all of your traffic information in this, very plausible, future case. "It's encrypted." or "It's going through a relay." Is just not a sufficient response to privacy if you wish to have privacy from AT&T. I mean, think about it, chances are, the relay will be using AT&T too.
Which brings me to the big problem with solutions like these, ie - inevitable recentralization. Google, or Facebook, (and now because of how this new decentralization idea works, even AT&T), are in an almost unassailable position to act as the "switchboard" for all of this non-indexed data. Need to know where your aunt, who just moved, is on this new decentralized network? Are you going to ask google? or "WeAreDecentralizationIdealists.com"? Oh, you're going to ask your own node? Sorry, her new information has not propagated to your node yet. Check back tomorrow. Oh forget it, just call your aunt and ask her to give you the new information so you can input it manually.
No, she will need to be registered somewhere to seamlessly communicate changes to her node's connection information. And that "somewhere" will likely be a BigCo.
If we want to replace the behemoths, we need to come up with solutions that are just as easy to use, (easier actually), and avoid any possible blocking or tracking. That requires some very creative people to think radically. Easily blockable in home network nodes I think, are not only not really solving the problems, but are also doomed to failure usability-wise when compared with google or facebook.
Meh, you don't need another box; just keep the family desktop on. But even without that, there is still value. The ISP tracking isn't much of an issue if you use an existing network like Tor. The software starts up, reads the locally encrypted SQLite DB for your list of "friends'" onion IDs, and connects to the gRPC services your friends are hosting on their machines as onion services. It maintains that stream and begins receiving fed data from those other servers (locally caching as you receive, which is more ideal in an ephemeral world than live retrieval, but it can be a mix of both depending on settings). All without the ISP knowing a thing except that you are connected to Tor.
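A minimal sketch of that startup path in Python, assuming a local `friends(name, onion_id)` table. The schema and demo row are invented for illustration, and the actual Tor SOCKS/gRPC dial is stubbed out:

```python
import sqlite3

# Sketch of the startup path described above: read the locally stored
# friend list (onion IDs) from SQLite, then connect to each friend's
# onion service. Schema and sample data are assumptions.

def load_friends(db_path=":memory:"):
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS friends (name TEXT, onion_id TEXT)")
    # Demo row; a real client would find the DB already populated.
    conn.execute("INSERT INTO friends VALUES (?, ?)",
                 ("alice", "exampleonionaddress.onion"))
    conn.commit()
    rows = conn.execute("SELECT name, onion_id FROM friends")
    return [{"name": n, "onion": o} for n, o in rows]

def connect(friend):
    # A real client would dial friend["onion"] through Tor's SOCKS
    # proxy (typically 127.0.0.1:9050) and open a gRPC stream.
    return {"friend": friend["name"], "status": "connecting"}

friends = load_friends()
for f in friends:
    print(connect(f))
```

The ISP-visible surface of all of this is just the Tor connection; the friend list and feed cache never leave the local machine.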
> Which brings me to the big problem with solutions like these, ie - inevitable recentralization.
Yup. Can't easily get around this. People are going to gravitate toward what's easier and what they want on the outside, ignoring what's on the inside. It happens to most continually adopted standards, even if it's just a more trusted server with more uptime. And that's OK; I don't want to win some ideological battle at the cost of user happiness. I completely agree the software must be so easy you can't tell what's under it, but I don't think it requires that radical/creative thinking. Just user-oriented effort instead of the constant barrage of difficult-to-setup tech demos.
Doesn't exist anymore (often enough).
> Just user-oriented effort instead of the constant barrage of difficult-to-setup tech demos.
Things like identity management and data storage make these barriers pretty deep. “I can delete my post and it basically won't be accessible anymore” (there can be physical exceptions so long as they're legibly exceptions to the social reality), “I don't have to think about how big my images are and can just post as much as I want”, “I can lose any of my own hardware and everything will still be there because it's in the cloud”, and “I can tell who my friends are based on common knowledge within my circles of their unique name which is easy to remember and meaningful” are all things that heavily constrain what you can do “on the inside”.
Mastodon has meanwhile managed to either do something right or get lucky wrt the path dependency of building structures where prosocial hosting behavior is convenient: a whole bunch of mostly-volunteer instances have sprung up, adopters have managed to make instance choice part of identity so that the domain-part isn't just a “meaningless extra thing to remember”, and federation remains reasonably strong; meanwhile, financial support for server costs has mostly leant toward the Patreon model, allowing a fraction of generous users to help support a bunch of free riders while not having to directly participate in administration. At the same time, despite Mastodon having almost exactly copied Twitter's model in terms of available user interactions, the zeitgeist has repeatedly suggested that users getting on board for the first time often had no idea what instances even could be, and had to have the very concept explained several different ways before it got real traction. Random instance death is also a problem that's tempering the mood nowadays, because keeping the server up requires enough motivation which sometimes runs out, and some instances have started having problems with media storage requirements, which, see above (though I'm told the internal architecture could use some optimization too).
There's something deeper in here surrounding the thorough conflation of type with instance in the popular side of the digital world; I feel as though something critical to the more literate concept of this didn't make it into the default folk model, such that only centralized services are legible. I have some hope that Mastodon and related ActivityPub-based federated services absorbing waves of people fleeing the abusive behavior of major social media (such as the recent Tumblr exodus) will make a dent in this and cause the appropriate concepts to reach critical cultural density.
I literally do not know anyone who owns a desktop computer any more.
Putting an automatically configured VPN on such a box would be extremely easy, no?
I have family to think about: I don't have time to update my servers every time some new zero-day is fixed. I'd rather pay someone else to deal with those details. My five-year-old is growing up; time with him is far more important than fixing security holes.
Nah, your auto-updating desktop is fine. The software itself might evolve to a hands-off, evergreen-type approach, but for now just a desktop daemon is fine. The reason keeping servers updated seems so non-trivial is that we visualize it in an ops sense, like we're at work.
As confirmed by the lack of complaints about auto-update in Windows 10. I mean, it's not like updates ever break anything anyway.
Often upload is 10x slower than download.
I think the better solution is to leverage cheap vms. At least performance stands a chance. It's just a matter of making cloud computers accessible and usable to non technical people.
So...centralize many/most popular internet services on infrastructure that provides cheap, reliable VMs. At the risk of overdrafting my snark budget: that sounds familiar.
Home servers are a very difficult sell (see $500 Helm) compared to VMs running in a data center and IMO the privacy difference is mostly illusory.
^ That is the expected reaction from normal desktop users. I mean literally: download an exe and up pops your feed, ready to add your friends, favorite businesses, news sites, link aggregators, etc. given their onion ID (yes, onion IDs are annoyingly large, especially v3, but discovery/identity comes later; don't let it hold up the system).
I'm not convinced you need a "home server" in the traditional sense. Just accept what you lose (uptime) if you use your laptop or phone to do the hosting. You can share between them too, given a synced private key, which is the software's job, not the user's. Still, an ephemeral self-hosted-on-desktop social network can go a long way (and again, people will let the need for uptime drive their always-on desktop decision). This stuff requires so few resources to start that a cheap Raspberry Pi with a simple install/reach-from-another-device setup would work just fine for people who don't have a home computer and want one just for this. Large storage can come later.
I do agree the privacy difference is minimal.
The "normal desktop user" should probably not be running their own self-hosting setup, because they will fail at backups and reliability and performance.
There are challenges like establishing initial connections, or push notifications, but these can hopefully be worked out.
This same thought has popped up in my mind as well before... I think it's a probable future only if these devices are discrete plug-in and forget machines and generate revenue for the owner.
I'm running Solid on my own box, and I can't see myself doing it any other way, but it was pretty hard to set it up. We need to change that.
I think you radically overestimate the desire and ability of the average computer-user to consider their devices' uptime.
https://news.ycombinator.com/user?id=cordonbleu
this is handy as well.
Just look at ActivityPub. It's essentially OStatus, but instead of XML we slapped namespaces on JSON, wrote a bunch of overly complex preprocessing procedures so that everyone can produce output just the way they want[1], and still made half the spec ambiguous enough[2] that implementers essentially follow the one rule that matters: maintain compatibility with Mastodon.
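For anyone who hasn't seen it, this is roughly what "namespaces on JSON" looks like on the wire: a minimal ActivityStreams 2.0 `Create` activity of the kind ActivityPub servers exchange (the actor and content here are made up):

```python
import json

# A minimal ActivityStreams 2.0 "Create" activity. The @context key is
# the JSON-LD namespace declaration; everything else is ordinary JSON.
# The actor URL and note content are illustrative only.
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Create",
    "actor": "https://example.social/users/alice",
    "object": {
        "type": "Note",
        "content": "Hello, fediverse!",
    },
}

print(json.dumps(activity, indent=2))
```

In practice most implementations treat documents like this as plain JSON with fixed keys; the JSON-LD processing machinery referenced in [1] is where the complexity the comment complains about lives.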
[1]: https://www.w3.org/TR/json-ld-api/#algorithm-5
[2]: https://please-just-end.me/ap.html#block-activity-outbox (domain name relevant to content)
https://github.com/kazarena/json-gold/blob/master/ld/api_nor...
Nothing proves his point more than:
<script src="//www.google-analytics.com/analytics.js..."

"Al Gore claims he's an environmentalist, yet he flies using his personal gas guzzling plane."
And yeah, it's an ad hominem: "a fallacious argumentative strategy whereby genuine discussion of the topic at hand is avoided by instead attacking the character, motive, or other attribute of the person making the argument."
You can most definitely use centralized servers to disseminate decentralization. I'm pretty sure https://ipfs.io started as a github project and on a monolithic server backed by cloudflare. That doesn't dismiss them in the least - considering they now heavily dogfood.
Yes, I track how popular what content is on my site. Motivates me to write more. Please feel free to block trackers; I do that as well.
I don't care if people run Analytics; it's just when they go sharing all that data with a third party that it gets troubling.
You can talk about it all you want, but as long as you continue to participate in the very thing you rail against, you're going to struggle to be taken seriously.
A) know how the sausage is made, and
B) stand to actually lose something if people believe them, i.e. they are standing by their point in spite of the negative consequences to themselves.
People on the dole who argue against (details / implementation of) social security should be taken seriously. Rich people arguing against tax breaks for the wealthy. Programmers against big tech firms. Etc.
> "But look, you found the notice, didn't you?" "Yes," said Arthur, "yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying 'Beware of the Leopard.'"
First, this idealistic idea that "we" are going to take back our data. Who is this we? Only the smart, high-agency people who have time to spare. The commercial web is increasingly tuned to the normal user, who is low-agency and easily led around. Who will win a battle of user acquisition and retention? Facebook or the rebels? Facebook of course. So any solutions proposed here are just for a tiny percentage of users who will then be isolated from the real and useful social networks. Or more realistically use both.
Or maybe if the infrastructure is built, a layer of savvy entrepreneurs can emerge to monetize it? I'm thinking of reaganemail, selling an anti-google email account to the AM radio crowd.
Second, the idea of somehow eliminating censorship. De facto censorship will always exist, even if you sugar coat it as Twitter has tried - "your content is still there, but only if someone explicitly looks for it". Any platform without censorship will just be flooded by every marketer and political zealot, for starters.
Also, I think he is conflating filter bubbles with centralization. Without centralization, wouldn't we still have filter bubbles as people self-select into their online communities?
Supposing we manage to solve this problem, what's to say average people can't participate in 10 years or so or so when the tech has been made easier to use?
It didn't start centralized. Centralization happened. I might be more cynical than I should be but as a designer I struggle to see the future in which we have social dynamics that favor decentralization instead of convergence into a less self-managed system (i.e. all current centralized networks).
This sort of discussion looks often like rose tinted spectacles. The past wasn't so different to today.
Perhaps, but those would be self-selected, not imposed by the provider. Big difference.
Don't get me wrong, I'm all for projects like this. I think it's wonderful. I just never really got how the apps will work with the same data without being forced into a particular data model (which seems like it would limit what you could do).
So, basically, there is one data model, RDF, but RDF does not require the same set of fields; on the contrary, you are free to write your own. Obviously, you wouldn't get good interoperability if you do. So, there are several things you can do:
1) Adopt what others are using.
2) Map your "fields" (we're more for calling them vocabularies) to the stuff others are doing, and rely on apps to figure out interop using reasoners.
3) Don't care; your app will work fine for you.
I mean, 3) is fine, it is just that you'd be missing out. 2) also works, kinda, but reasoners aren't all that easy to use, so I'd mostly like to see people go for 1).
So, we need to make it really easy to find existing stuff. You could go for the big one, i.e. https://schema.org/ or you could go more in detail and look at https://lov.linkeddata.es/dataset/lov/ . The former has a lot of traction, the latter is real decentralized, so I kinda prefer that.
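As a concrete sketch of option 1), adopting existing vocabulary often just means reusing schema.org terms in a JSON-LD document instead of minting your own field names (the data below is illustrative):

```python
import json

# Option 1) in practice: describe a person with schema.org's existing
# vocabulary ("Person", "name", "knows") rather than inventing fields.
# Plain JSON-LD built with the stdlib; the people are examples.
person = {
    "@context": "https://schema.org/",
    "@type": "Person",
    "name": "Ada Lovelace",
    "knows": {"@type": "Person", "name": "Charles Babbage"},
}

print(json.dumps(person, indent=2))
```

Any other app that understands schema.org terms can consume this document without prior coordination, which is the interoperability payoff the numbered options above are trading against.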
Then, we have to make it real easy to author new stuff when you can't find existing stuff, because that will happen. Then, we need to make it easy for others to find yours, so that they can start using it too for similar applications. And, I'm thinking that it will be kind of a graduation process, where you first look for existing stuff, and when failing to find anything, you just mint your own without thinking about others, just to get something that works up and running. Once your app starts gaining traction, you tighten it up, and if then something other gets popular, you can migrate to that with little disruption.
So, we're not there yet, but we're thinking and working on it a lot.
I'm fairly confident that 98% of the population of the earth doesn't give a crap that their data is collected, or that they don't "control" it. This whole "decentralized web" thing is just privacy nerds trying to convince us that we need this, when really no regular consumer is asking for it.
I encourage you to read the article, where you'll see that I'm arguing from a permissionless innovation perspective, not so much privacy.
Plenty of services are API compatible with Amazon S3 (e.g. anyone can run their own S3 clone) so people can modify existing sites to use S3 with OAuth. Use OAuth to allow the user to delegate access to their S3 service link. No new protocols needed, no big innovations required.
But for this to work on anything other than the most rudimentary data (media files, blog posts, and serialized data) would require completely changing the way all modern applications are written. Databases would all have to change, APIs would all need to follow specific standards, and networks would need to become a hell of a lot more stable, higher bandwidth, and lower latency.
Assume you're Twitter, and you want to map-reduce all of the data of all your users to find out how many people retweeted a user, and then notify those users. Now you need to connect to every user's service provider, get their data, store it temporarily on your own servers, duplicate everything, do your processing, and then write changes back to all storage services for all users. Now do this every second. If you don't, you have to store this map-reduced data on your own service's storage, which violates the principle of only using the user's storage pod.
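To make the scale problem concrete, here's the described workload simulated naively in Python, with each user's storage pod faked as a dict (the pod layout is invented; in reality every `fetch` would be a network round-trip to a different provider):

```python
from collections import Counter

# Naive simulation of the cross-pod map-reduce described above:
# pull every user's data from that user's pod, aggregate centrally,
# then (not shown) push notifications back out. Pod contents are fake.
pods = {
    "alice": {"retweets_of": ["tweet-1", "tweet-2"]},
    "bob":   {"retweets_of": ["tweet-1"]},
    "carol": {"retweets_of": ["tweet-2", "tweet-2"]},
}

def fetch(pod):
    # Map step: in the decentralized model this is one network
    # round-trip per user, to that user's storage provider.
    return pod["retweets_of"]

counts = Counter()  # Reduce step: centralized, duplicated storage.
for user, pod in pods.items():
    counts.update(fetch(pod))

print(counts["tweet-1"])  # → 2
print(counts["tweet-2"])  # → 3
```

With three fake pods this is trivial; the parent's point is that doing it every second across hundreds of millions of independently hosted pods is not.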
In fact, data would have to become more centralized to work in this model. Currently, application data exists across a range of services in a variety of networks, all of it being dynamically accessed in different ways before it is accessed by a user. There are dozens of different databases used just to open up the TV Guide on your cable company's set-top box. All of that would have to be centralized in one or two databases in order for the storage and processing to be disconnected.
Not only that, but a lot of data is useless to anyone but the original service provider or original application. Only a Facebook clone would be able to use Facebook's data, and only data relevant to Facebook's ad sales should stay on Facebook's servers, even if it contains "Peter clicked on ad X at Y time". Should there be a separation of what kind of data gets decentralized? Do we really want to go down the rabbit hole of what is my data, and what is data about me that a company has originated and created value from? (Is a picture mine because it's a picture of something I own, or is it mine if I took the picture?)
The idea that every component of every application could be completely decentralized from every other is unlikely to pan out. Now, what is more in the realm of possibility is doing a Google or Facebook, and creating features that allow exporting or importing all data. But that process is not perfect, and it can take anywhere from minutes to days. And to use this data, it would still all have to follow standards specific to a particular application.
And again, we already have a lot of these data standards. We have standards for most of the kinds of data that exist today, such as calendar, contacts, e-mail, instant message, voip, office documents, images, and so on. We have standards to synchronize and syndicate data feeds. We have standards to federate accounts and manage permissions. But commercial sites don't natively build these features as interoperable with each other - because, why would they?
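As an illustration of how lightweight some of these standards are: the calendar one (iCalendar, RFC 5545) is just structured text, simple enough to emit by hand. A minimal sketch (a real application should use a library):

```python
# Minimal iCalendar (RFC 5545) event, built by hand. The PRODID value
# is a placeholder.

def make_vevent(uid, start, end, summary):
    """start/end are UTC timestamps formatted as YYYYMMDDTHHMMSSZ."""
    return "\r\n".join([
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//sketch//EN",
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTART:{start}",
        f"DTEND:{end}",
        f"SUMMARY:{summary}",
        "END:VEVENT",
        "END:VCALENDAR",
    ])

ics = make_vevent("1@example.org", "20190101T090000Z",
                  "20190101T100000Z", "Standards exist")
print(ics.splitlines()[0])  # BEGIN:VCALENDAR
```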
Storage and processing of data are intimately connected with the specific applications that use them, and trying to decouple them will result in inefficiency and complication, with no clear advantages.
Your TV Guide is a good example of things that aren't hard. They don't change very quickly, so you can just use a cache. That's easy.
Finding the number of RTs, that's also easy (apart from it being an open world, of course). When someone RTs, they notify you. And you want to display those RTs with your tweet? Just cache the notifications you received.
Stable data access standard? That is Solid itself. And the data model, that's RDF.
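A minimal sketch of that notify-and-cache pattern (the class and method names here are illustrative, not the actual Solid API; the idea is in the spirit of Linked Data Notifications):

```python
# Sketch of notify-and-cache: instead of scanning every pod,
# retweeters send a notification to the author's inbox, and the
# author caches them locally. Illustrative names only.

class Inbox:
    def __init__(self):
        self.cache = {}  # tweet id -> set of retweeter WebIDs

    def notify(self, tweet_id, actor):
        self.cache.setdefault(tweet_id, set()).add(actor)

    def retweet_count(self, tweet_id):
        return len(self.cache.get(tweet_id, set()))

inbox = Inbox()
inbox.notify("t1", "https://alice.example/profile#me")
inbox.notify("t1", "https://bob.example/profile#me")
print(inbox.retweet_count("t1"))  # 2
```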
There are ways that you can go about doing this stuff.
Finally, we're also getting some traction around this in academia; they've been hung up on stuff that isn't helpful for too long.
We have parallels from other platforms - specifically the fixed and mobile phone networks.
There used to be monopolies in local phone service. There were new competitors, but to change provider, you had to change phone number.
Even changing cell phone provider required a number change.
This obviously had strong network effects pulling you to stay with your provider. You had to tell _everyone_ in your extended network where to find you and have them update all of their business records when you changed from one carrier to another.
Eventually, everyone figured out this was stupid, and Number Portability [1] was forced on carriers by regulation.
This problem is completely gone now. You can take your number with you.
If we allow people to take their data to new social networks, and force federation, then we will get decentralization. However, it won't happen without regulation any more than it did with the phone companies.
[1] https://en.wikipedia.org/wiki/Local_number_portability#Histo...
The government then says "You have to allow competitors access to X, and you have to do it by date Y".
Then the companies get together and agree on how to do it because they agree that government dictated standards suck. There is usually some jostling around with someone wanting to run a centralized database for a nice per-transaction fee. Typically this is tossed out in committee, but not always.
*disclaimer: I help develop Bunsen Browser, the mobile companion for Beaker Browser.
Choosing between service providers is no more meaningful for privacy than asking Windows users to download arbitrary apps. If smartphones are any more secure than desktops, it's because Apple and Google are constantly improving OS-level security and policing their app stores for malware.
Of course app stores have well-known flaws. But if we want to do better than that, someone has to figure out a better way to choose good rules and enforce the rules better.
It's a p2p caching proxy that also lets you edit web pages collaboratively in realtime over a LAN or the internet. It has a contacts list system and p2p chat functionality. This project effectively died due to lack of interest and I still have various security concerns about it (Should you break/reimplement Same-Origin policy or break/reimplement the TLS chain of trust?)
The main security concern is that because it decentralises HTTP in-place (existing URLs can now be looked up on an arbitrary number of overlay networks if the original URL isn't providing an OK response) it puts users at risk of malicious actors spamming overlay networks with browser exploits for popular resources like "news.ycombinator.com/".
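One commonly proposed mitigation, which only helps for static resources, is to treat overlay responses as untrusted and accept them only if they match a digest pinned when the resource was first fetched from the authoritative origin. A sketch of the idea (not Synchrony's actual design):

```python
import hashlib

# Content pinning: record a digest of the resource from the original
# origin, then reject any overlay response that doesn't match it.
# This cannot help with dynamic pages like "news.ycombinator.com/",
# which is part of why in-place decentralization of live URLs is hard.

def pin(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

def accept(pinned_digest: str, overlay_response: bytes) -> bool:
    return hashlib.sha256(overlay_response).hexdigest() == pinned_digest

original = b"<html>front page snapshot</html>"
digest = pin(original)
print(accept(digest, original))                     # True
print(accept(digest, b"<script>exploit</script>"))  # False
```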
I hope TBL and co converge on satisfying answers to these problems or constrain their design to not bother with decentralising existing URLs in-situ.
Code lives here: https://github.com/Psybernetics/Synchrony
Feel free to shoot me any questions.
From what I understand, the proposal here seems not to allow for the advertising model. I don't think services can grow and survive by making people pay, because people are too cheap.
There might be a better chance for something like this if they allow for the economics:
- Maybe the data host can provide an "advertising" profile which the user has control of. This could be exposed to the application hosts to allow for advertising.
- Maybe you also throw micropayments into the mix, or allow bartering with information.
Another issue is complexity. A number of comments have talked about over-engineered solutions and protocols. This decentralized idea could be started with something small, like an open social network standard. I think I saw something similar to this on HN not too long ago:
- You have a web site, which is your profile. A provider could give you a nice editor for it.
- You have a feed, where you can put pictures, short posts, long posts, whatever. This is distributed with RSS. (The host makes this all seamless for you.)
- Identity is controlled with OAuth, used only to give an identity to visiting users. The owner can manage permissions for certain remote users (his "friends").
Such a service could be managed on your own web server, or there could be different cloud providers that make this arbitrarily easy, with arbitrary levels of functionality on the "profile" page, the "feed", and the "friend" permission management.
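The feed piece of this is plain RSS 2.0, which a host could generate from the user's posts. A minimal sketch (the URLs are placeholders):

```python
from xml.sax.saxutils import escape

# Minimal RSS 2.0 feed built from a list of posts -- the distribution
# format for the "feed" in the sketch above.

def rss_feed(title, link, items):
    """items: list of (item_title, item_link) tuples."""
    body = "".join(
        f"<item><title>{escape(t)}</title><link>{escape(l)}</link></item>"
        for t, l in items
    )
    return (
        '<?xml version="1.0"?><rss version="2.0"><channel>'
        f"<title>{escape(title)}</title><link>{escape(link)}</link>"
        f"{body}</channel></rss>"
    )

feed = rss_feed("My profile feed", "https://me.example/",
                [("First post", "https://me.example/posts/1")])
print(feed[:21])  # <?xml version="1.0"?>
```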
This whole article looks like "well, the obstacles are not technological, but let me write a few pages about technology anyways".
If the obstacles are not technological, then we need non-technological solutions. So far, I think the GDPR is one such non-technological step towards taking back control of our personal data.
The hardest problem in my opinion is "preventing the spread of misinformation" because we essentially need a way to distinguish between malice and stupidity. Without mind-reading I do not see how this could be possible at scale.
The crucial point is that Solid will bring more choice: there will be social feed viewers that will be more invasive, and those that will be less invasive. People can choose the one they like, without consequences as to whom they can interact with. Today, we do not have a choice: if we want to interact with people who use Facebook, we have to use Facebook as well.
There are stronger alternatives. We need to make a push to begin using them.
You then need an alternative name system which links a unique human-readable name to a public key. This is the trickier part (see Zooko's triangle), but there are some creative solutions like Namecoin and the Blockstack Name Service.
Easy: use DNS, store the PGP key ID in a TXT record, and then look up the public key for that ID using a PGP key server.
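A sketch of the client side of that lookup. The TXT record convention here ("openpgpkey=&lt;fingerprint&gt;") is illustrative, not an existing standard (real deployments would more likely use OPENPGPKEY records from RFC 7929, or WKD); the actual DNS query is omitted:

```python
# Parse a hypothetical TXT record and build the keyserver lookup URL.
# The "openpgpkey=" convention is made up for illustration; the
# keys.openpgp.org by-fingerprint endpoint is its real VKS API.

def parse_key_txt(txt_record: str) -> str:
    """Extract a key fingerprint from a hypothetical TXT record."""
    prefix = "openpgpkey="
    if not txt_record.startswith(prefix):
        raise ValueError("not a key record")
    return txt_record[len(prefix):].upper()

def keyserver_url(fingerprint: str) -> str:
    return f"https://keys.openpgp.org/vks/v1/by-fingerprint/{fingerprint}"

fpr = parse_key_txt("openpgpkey=0123456789abcdef0123456789abcdef01234567")
print(keyserver_url(fpr))
```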
I'm pretty sure there aren't better alternatives.
In practice, I am satisfied with just using my own domain for email, my web site, and self-hosted blog. For communication I like FaceTime so I can see people while I am talking with them, phone, and email.
I still use social media, very occasionally, to see what people are doing and sometimes advertise my new open source projects and updates, and any books I write. Most of the problems people talk about with Facebook/Twitter don’t bother me as long as I only use the systems infrequently. I am not tempted to cancel my accounts.
Ask yourself: who is this for? People who are not already deeply passionate will stop reading unless they are engaged within a minute of reading. Note that a minute is being extremely generous; on a commercial consumer site, it's apparently an average of 7 seconds before someone clicks away.
I recommend that you check out this video and reconsider how you might reframe your message as a call to action that speaks to a better future we can create together.
https://youtu.be/qp0HIF3SfI4?t=121
I even jumped you to the good part.
If we still don't have decentralization, it's because it is not as easy.
The solution involved running a mesh network with nodes on a user's laptop or desktop and a corresponding node in the cloud. These nodes would index local data, replicate metadata across nodes, and back up the actual data to the cloud node.
A locally running web app acted as a replacement for 'Windows Explorer'. It allowed the user to access all their files and folders across all their nodes (open documents, play music/videos, see contacts, etc.), create smart collections, and share these files, folders, or collections with other users in a secure, authenticated, and private manner.
Each user got an identity, which comprised a dedicated domain (or subdomain) and a PKI certificate tied to that domain. Each node had its own private key, and their public keys were tied together by the identity certificate.
All communication between nodes (of the same user or across users) was authenticated and encrypted using these identity/node keys and certificates. No central node existed in the system that could spy on these activities. The architecture separated the network discovery cloud nodes from your data cloud nodes, and allowed your data cloud nodes to be hosted separately anywhere (say, in your own cloud instances).
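A greatly simplified sketch of that identity model, with bare hash fingerprints standing in for real X.509 signatures (a real system would verify a certificate chain, not set membership):

```python
import hashlib

# Simplified identity model: an identity document (standing in for
# the domain-bound certificate) lists fingerprints of each node's
# public key, so a peer can check whether a key belongs to the same
# identity. Hashes here are a stand-in for actual signatures.

def fingerprint(pubkey):
    return hashlib.sha256(pubkey).hexdigest()

def make_identity(domain, node_pubkeys):
    return {"domain": domain,
            "nodes": {fingerprint(k) for k in node_pubkeys}}

def node_belongs(identity, pubkey):
    return fingerprint(pubkey) in identity["nodes"]

ident = make_identity("alice.example", [b"laptop-pubkey", b"cloud-pubkey"])
print(node_belongs(ident, b"laptop-pubkey"))  # True
print(node_belongs(ident, b"rogue-pubkey"))   # False
```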
This is the only system I have seen that utilized zero knowledge protocols and made it accessible to common people to manage their data and share with others as well.
But unfortunately, as a business it never took off. It got acquired by EMC and merged with Mozy (the good old data backup company), and then the product died a silent death in 2010.
Maybe it was timing; maybe if this product had launched after Snowden, it would have done well.
But now, I think a more urgent and a relatively less complex problem to solve is one of distributed communication. In this era of always connected powerful devices (mobile phones, home gateways), why don't we all have our own personal email/chat servers that nobody else can spy on? Why does email and chat have to get relayed via big aggregators who mine so much data as well as metadata?
Not only do they violate privacy, they succumb to security breaches and cause serious damages.
I feel the stage is set for this disruption: crypto protocols, always-on cheap connectivity, compute power at the edge, and sensitivity to privacy/security in the general population – all of these ingredients are in place right now for this to happen.
> unfortunately, as a business it never took off
Sounds like the timing was too early in 2005. I believe these days we're all so tired of the privacy and security situation, that the world is ready for something like this.
> [it] utilized zero knowledge protocols and made it accessible to common people to manage their data and share with others
This describes exactly what the re-decentralized web needs.
Seeing the many attempts over recent years, it looks like there are significant technical, financial/business and social challenges - but I totally agree with your conclusion, that "the stage is set for this disruption". It also feels like the tide is rising, that the solution is being worked on from numerous fronts and eventually a more evolved system will be adopted by the public.
Maybe this is a lesson that we need to be less tolerant towards the creation of centralised services because those with money and power will seek to bring decentralised systems under their own control.
- GPU passthrough VM (gaming)
- SATA passthrough (FreeNAS)
- multi NIC passthrough (pfSense/OpenWRT)
- app server/cloud/P2P Linux or FreeBSD VM(s)
http://unraid.net sells a KVM-based product. VMware ESXi and XenServer are free. Connect a Ubiquiti AC-Lite WiFi access point to a dedicated NIC on the x86 box, WAN to another NIC. Since pfSense owns the WAN NIC, it can host a VPN server for your devices, including mobile. All VMs get virtual NICs. Dell T30 with quad-core Xeon and ECC costs about $400 with 8GB RAM and 1TB disk; it can hold 4 x 3.5" drives (20 TB in RAID-1) and 2 x 2.5" SSD. Level1Techs has intro videos on home servers: https://www.youtube.com/results?search_query=level1+home+ser...
Advantages:
- Stable and boring x86 platform
- Good performance for gaming
- Commercially supported hardware
- Upgradeable storage and GPU
- Upgradeable router software

Watching your TED talk in 2013 was one of the most influential moments in my life, and discovering the semantic web was perhaps my greatest epiphany. While the vision never left my mind, I never acted on it. Until now.
I'm dedicating 2019 to linked data. I'm going all-in.
Last week, I started to build a tool to convert unstructured input to linked data. Even after recognizing canonical literals (email, phone, url, color, gender, boolean, integer, float, date, time span, money, weight, distance, language, image, geo coordinates), I couldn't accurately infer predicates and guess classes. Before trying more complicated stuff like bayesian inference, I decided to try a simpler exercise.
This time, I want to aggregate structured data from different sources and map it to some existing ontologies. For example, I want to convert some JSON about comments and links from Reddit and Hacker News to RDF using the http://schema.org vocabulary.
- Can I feed the JSON into some ML system that automatically figures out the mapping? What if I provide some annotation or feedback?
- Can I manually turn the JSON into JSON-LD and use that as the mapping information? What about complex transformations (different structures and literals)?
- Should I implement the mapping manually using my favorite programming language?
- Should I use R2RML or RML?
What's the state of the art today for semantic data integration?
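For context on the second option, here's what a hand-written JSON-LD mapping looks like for a single Hacker News item. Input field names follow the HN Firebase API (`id`, `by`, `time`, `text`); the choice of schema.org terms on the output side is my own:

```python
# Map one HN item dict to a schema.org Comment in JSON-LD.
# The mapping choices (Comment, author -> Person, etc.) are one
# reasonable reading of the schema.org vocabulary, not canonical.

def hn_item_to_jsonld(item):
    return {
        "@context": "http://schema.org",
        "@type": "Comment",
        "@id": f"https://news.ycombinator.com/item?id={item['id']}",
        "author": {"@type": "Person", "name": item["by"]},
        "text": item.get("text", ""),
        "dateCreated": item["time"],  # unix epoch; a full mapping
                                      # would convert to ISO 8601
    }

doc = hn_item_to_jsonld({"id": 1, "by": "pg", "time": 1160418111,
                         "text": "Y Combinator"})
print(doc["@type"])  # Comment
```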
- Homepage http://wit.istc.cnr.it/stlab-tools/fred/
- Paper https://www.researchgate.net/publication/280113533_FRED_From...
There are likely other projects and papers, google 'text to rdf nlp'
Stephen Reed (ex-Cyc engineer) also did some interesting work in this field, in his Texai project, over 10 years ago. Although there are few references to it on the web now: that part of his project is no longer open source (and I know of no known mirrors).
- Paper https://pdfs.semanticscholar.org/8026/107de65c5a14aa8d0d47f9...
- Homepage http://texai.org
- http://homepages.inf.ed.ac.uk/kbyrne3/docs/thesisfinal.pdf
- https://www.researchgate.net/publication/228378264_A_very_br...
For decentralization, the root problem has always existed: while pointing at another resource requires no permission, receiving and hosting that resource does. Your government has to let you receive it, and your ISP has to let you host.
This is a much lower level problem compared to the three challenges Berners-Lee puts forward, which seem to have little to do with decentralization.
1. taking back control of our personal data;
2. preventing the spread of misinformation;
3. realizing transparency for political advertising.
What about Google Chrome?
Facebook, probably not so much. Their business model is data harvesting.
Regarding Solid, note that we don't want to overthrow or replace any existing social networks. We start with offering experiences they cannot offer due to their siloed nature.
Pretty sure Chrome did. Or WebKit/Blink family. This is GOOD imho.
Such a centralization comes with the risk of websites only working with one browser, forcing people to chose a certain device, operating system, and browser vendor.
Regular people and businesses are always going to make the decision in front of them.
'Decentralization', unto itself, is not something anyone directly cares about. People care about privacy, somewhat, but there are other paths to privacy, or at least, consumers may very well believe there are.
Decentralization will only happen with a real impetus: a product or service that facilitates it, that people want, either for issues related to decentralization, or, more likely for some other reason that just happens to facilitate decentralization for some other, related reason.
In both cases, DNS and TLS CA-based stuff is about trust. You need to trust the DNS server, as there could be malicious servers sneaking in, and you need to trust the cert.
But once you have a social network with a large strong set, you could base the trust on the strong set, and in particular, individuals in that strong set who can demonstrate that they have a clue.
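A toy sketch of what basing trust on the strong set could look like: treat a key as trusted if it is reachable from keys you already trust via mutual (two-way) endorsements. Real web-of-trust metrics are far more involved; this just shows the shape of the idea:

```python
# Trust via mutual endorsements: one-way endorsements (e.g. from
# "mallory" toward a root) don't make the endorser trusted.

def trusted(endorsements, roots):
    """endorsements: {key: set of keys it endorses}.
    Returns all keys reachable from roots via two-way endorsements."""
    mutual = {k: {v for v in vs if k in endorsements.get(v, set())}
              for k, vs in endorsements.items()}
    seen, stack = set(roots), list(roots)
    while stack:
        k = stack.pop()
        for v in mutual.get(k, set()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

graph = {"me": {"alice"}, "alice": {"me", "bob"},
         "bob": {"alice"}, "mallory": {"me"}}
print(sorted(trusted(graph, {"me"})))  # ['alice', 'bob', 'me']
```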
Once we have that, we can get rid of these achilles heels, but quite frankly, I don't believe in a strategy that takes on those problems first.
Sure, I obviously got OpenNIC in my DNS resolution. Haven't once seen an address that required me to use it beyond when I set it up. I think our approach is much better. Base it on people and the strongest part of their network.
Disclaimer: building a registrar[2] for Handshake so we're pretty excited about it!
(But they have a key signing ceremony, so we can trust that it is secure.)
There are alternative DNS roots, though. I participated in running one myself for a time.
1. https://en.wikipedia.org/wiki/ICANN
2. https://en.wikipedia.org/wiki/Root_name_server#Root_server_s...
> The situation becomes problematic when we are robbed of our choice, deceived into thinking there is only one access gate to a space that, in reality, we collectively own.
Robbery - the action of taking property unlawfully from a person or place by force or threat of force. [0]
Deceit - The action or practice of deceiving someone by concealing or misrepresenting the truth [1]
That's what those words mean. They also have nothing to do with anything that has happened with the internet over the last 20 years.
Are you railing against the use of "rob" with an intangible noun? Would you cry foul at phrases like "robbed of their dignity?" Do you ignore alternative definitions like "to deprive of something unjustly or injuriously?"[0]
Do you believe that nobody involved in centralization conceals or misrepresents the truth? Does a marketer never overstate the benefits of their hosted solution?