Pretty damn powerful. The next level above this is a search index that lets users generate their own results, using their own machine learning algorithm or ranking weights based on their own preferences, because they would have direct access to the DB index and features. People could write anti-ad plugins. There could be FOSS upgrades all the time. Nobody would have to spend a particularly crazy amount of money on storage if they could all just cache the bits they needed themselves. Quite remarkable imo!! Whatever ends Google search’s reign will probably be user-owned in a way that seems a lot like this.
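A minimal sketch of what client-side ranking over a locally available index could look like. The feature names (`seeders`, `is_ad`) and the weight values are hypothetical, chosen only for illustration; nothing here is taken from p2psearch itself.

```python
def rank(results, weights):
    """Order results by a user-supplied linear score over their features."""
    def score(result):
        # Features without a user weight contribute nothing to the score.
        return sum(weights.get(name, 0.0) * value
                   for name, value in result["features"].items())
    return sorted(results, key=score, reverse=True)


# Hypothetical index rows and one user's preferences.
results = [
    {"title": "A", "features": {"seeders": 50, "is_ad": 1}},
    {"title": "B", "features": {"seeders": 10, "is_ad": 0}},
    {"title": "C", "features": {"seeders": 30, "is_ad": 0}},
]
my_weights = {"seeders": 1.0, "is_ad": -1000.0}  # an "anti-ad plugin" in one line
ranked_titles = [r["title"] for r in rank(results, my_weights)]
```

Because the weights live on the user's machine, swapping in a different scoring function needs no cooperation from whoever published the index.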
If you want to use TOR and onion sites, I don't think this really adds anything to those.
I think this just helps you optionally crowdsource bandwidth.
You don't expose your data layer directly to the consumer, you expose an API that will resolve to one, two or several databases from one or many peers. The indirection allows you to define your rules and use your data layer in the way that best fits your application's goals.
So what is immutable is your application, API and initial data, which you can mutate at later stages through other torrents or by consuming other APIs from other peers and mutating your initial database state.
The problem is that the current browser is not meant for this, and JavaScript is not meant for bigger, more complex applications (of course you can do it, but..)
Given tech like that described in the linked article, you could reasonably look up such a function from a peer-to-peer network.
> each function is identified by a hash of its AST.
That encoding for mobile code only works if the expression has no free variables (i.e. is closed).
It turns out that it is surprisingly difficult to write code which you can be sure has no free variables at particular points (the "send this code to another machine for execution" points). Especially in functional languages. If you write a higher-order functional, you know nothing about the closedness of its argument. If you require that all function arguments be closed it breaks all the useful functional programming techniques. The only way to make it workable is to check closedness only on those values which need to be mobile.
If you do the closedness-check at runtime, toy examples will work fine and the technique looks quite powerful, but once your codebase gets to a meaningful level of complexity it turns into a game of whack-a-mole with closedness heisenbugs. You quickly discover that closedness is data-dependent.
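A minimal sketch of the runtime closedness check described above, on a toy lambda-calculus AST. The `Var`/`Lam`/`App` classes and the `mobilize` helper are hypothetical names for illustration, and hashing the `repr` is only a stand-in for a real AST hash (it does not treat alpha-equivalent terms as equal).

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


def free_vars(term) -> frozenset:
    """Return the names that occur in `term` without an enclosing binder."""
    if isinstance(term, Var):
        return frozenset({term.name})
    if isinstance(term, Lam):
        return free_vars(term.body) - {term.param}
    if isinstance(term, App):
        return free_vars(term.fn) | free_vars(term.arg)
    raise TypeError(f"not a term: {term!r}")


def ast_hash(term) -> str:
    """Content address for a term (assumption: SHA-256 over its repr)."""
    return hashlib.sha256(repr(term).encode("utf-8")).hexdigest()


def mobilize(term) -> str:
    """Refuse to ship open code; otherwise return the term's hash."""
    open_names = free_vars(term)
    if open_names:
        raise ValueError(f"term is open, free variables: {sorted(open_names)}")
    return ast_hash(term)


identity = Lam("x", Var("x"))                        # closed
leaky = Lam("x", App(Var("x"), Var("y")))            # `y` is free
```

Here `mobilize(identity)` succeeds while `mobilize(leaky)` raises. Whether a given value is closed is only known once the term has been built, which is the data dependence the comment above warns about.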
In order to check closedness statically you need quite a sophisticated system of modal types. MetaOCaml was the most usable result of all this research:
https://okmij.org/ftp/ML/MetaOCaml.html
The important upshot here is that this isn't just some check you can slap onto the compiler. The programmer has to think about these types, and craft them carefully, as an integral part of the programming process. Basically you aren't just writing a program, you're writing a program plus a proof that it won't try to mobilize open code. Writing formal machine-checked proofs is not something that most programmers are good at.
Anyone who shares content (whether a file or a byte) exposes themselves in one way or another and can be targeted. Anyone else remember the “good old days” when torrent sites stood up in the face of the law and yelled “we host nothing illegal because we host nothing!”? They were right. Then the law firms started joining the swarms to collect the IP addresses of everyone that shared with them.
Now they rarely bother trying to sue IP addresses. The legal ability is still there, it just turns out it’s not worth the cost to sometimes collect a few hundred dollars from someone who can’t afford it.
When we talk about transformative tech, we have to immediately consider the “standard scenarios” because being blind to them is how most technology trips and falls: pirating (especially live PPV events), illegal porn, utterly violent videos, information on making weapons, letting children access the same and/or mixing them in with adults in unsafe and unsupervised spaces, openly anti-government statements in countries where they make people disappear, and so on.
These things happen. And they happen faster and more intensely than most programmers can handle. The only way to keep tools like this viable is to build the tech with those in mind.
But that's also the fatal flaw. It's pretty definitionally true of stuff on the web that the person who owns / controls it wants to be able to change / update it. That isn't really supported at all at present for this distribution method, and it's hard to see how it could be in the future without introducing a single choke point.
Case in point: the vast majority of torrent traffic is for new torrents, because people want to download the stuff that's just been released, not the stuff that they either already have or could have had months ago. You lose 100% of this traffic with this "p2psearch" method, because the database can't be updated. Or if it can be updated (stick it on a traditional website for people to mirror), you rely on everyone updating to the newest version of the database.
It's also incredibly slow compared to traditional sites. It took longer than it took me to type this comment to see a single result for the popular title I tried to search for.
however, as implemented, all that is needed is for the user of p2psearch to refresh and the browser to pick up the latest database. i imagine most users are not keeping torrent search open 24/7, so this doesn't seem onerous.
it's probably a bit of a process for the host of the frontend to update the database, prepare a new torrent, update the code [0], and then rebuild the bundle regularly, but this could be automated.
regardless, it doesn't seem so unreasonable from an end user perspective, and i personally don't mind if my torrent search index is behind by a few days.
[0] https://gitlab.com/boredcaveman/p2psearch/-/blob/main/src/Me...
You get other features with it for free. Popular chunks will be easier to obtain as many have them and preservation starts with rare pieces. You'd want common use to be fast and uncommon to complete eventually.
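A minimal sketch of the rarest-first ordering hinted at above: pieces held by the fewest peers are fetched first, so rare data gets replicated before it disappears. The function name and the set-of-piece-indexes representation are assumptions for illustration, not any client's actual API.

```python
from collections import Counter


def rarest_first(wanted, peer_pieces):
    """Order the wanted piece indexes so the least-replicated ones come first.

    `peer_pieces` is an iterable of sets, one per peer, holding the piece
    indexes that peer can serve. Pieces nobody has are left out.
    """
    counts = Counter(p for pieces in peer_pieces for p in pieces)
    obtainable = [p for p in wanted if counts[p] > 0]
    # Tie-break on the piece index so the order is deterministic.
    return sorted(obtainable, key=lambda p: (counts[p], p))


peers = [{0, 1, 2}, {0, 1}, {0}]
order = rarest_first(range(4), peers)
```

In this example piece 2 has a single holder and is requested first, while piece 3 has no holder at all and cannot be requested.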
If a new torrent uses the same folder name the new files will appear next to the old ones but if so desired the old files may be duplicated into the new torrent.
It's not IPFS but it works.
If you can get a person or organization to sign off on the data I'd say it's a feature rather than a bug?
Let the professor publish his data set and let interested parties store it without much effort.
In the Napster/Torrent era once I found a video, document or audio I was interested in, it was very easy to find again without having to worry about archiving it myself.
So sure, things are "unkillable", but that depends on someone somewhere deciding it's worth storing that particular chunk. Seems strange to pair "unkillable" with something that depends on someone somewhere who "might" do something.
Much like how IPFS was easily oversold: great, you can find things hosted anywhere on the planet, but if you publish 1M files and expect to magically see them hosted elsewhere a year later, you are likely to be disappointed.
I do wonder if it would be a better approach to replace filecoin or similar complex trust relationships with a simple peer to peer trading program. Something along the lines of "Let's trade 128MB", trust but verify, then watch uptime/availability. For clients up less than a month, give them the free 128MB and watch; 1-3 months, trust them enough to store data with a 20x replication; 3-12 months, 10x replication; over a year, 5x; whitelisted peers of friends/family, 3x.
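A minimal sketch of the tiering proposed above. The day thresholds (a month taken as 30 days, a year as 365) and the choice to return `None` for peers still on probation are assumptions; the comment does not say how boundaries are handled.

```python
from typing import Optional

TRADE_UNIT_MB = 128  # the "Let's trade 128MB" unit from the proposal


def replication_factor(days_up: int, whitelisted: bool = False) -> Optional[int]:
    """How many copies to keep when storing data on peers of this age.

    Returns None for peers under a month old: they get the free trade unit
    and are only observed, not yet trusted with data.
    """
    if whitelisted:
        return 3            # friends/family
    if days_up < 30:
        return None         # probation
    if days_up < 90:
        return 20           # 1-3 months
    if days_up < 365:
        return 10           # 3-12 months
    return 5                # over a year


def raw_storage_mb(payload_mb: int, factor: int) -> int:
    """Total space consumed across the network for one payload."""
    return payload_mb * factor
```

Under these assumptions, storing one 128MB unit on two-month-old peers costs `raw_storage_mb(128, 20)` = 2560MB of network space, versus 640MB on peers with over a year of uptime.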
In my case I've implemented a new "browser" based on Chrome that allows this to work without having to resort to browser-only infrastructure (for instance, applications can dodge JavaScript and also call RPC APIs from other applications directly).
The applications and their data are distributed over torrent and managed to work together in the same environment as a flock, where one app can consume its own APIs and also the APIs of others.
service Search {
rpc doQuery(string query) => (array<string> result) // the access to the sqlite db from torrents is encapsulated here
}
The beauty of this design is that it can also be re-scheduled and have the same request routed to other peers.
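A minimal sketch of the indirection behind a `Search` service like the one above: callers see only `do_query`, while the service decides which SQLite databases to consult. The `torrents(title)` schema, the `make_db` helper and the in-memory databases are assumptions standing in for databases fetched over torrent; this is not the commenter's actual implementation.

```python
import sqlite3
from typing import Iterable, List


def make_db(titles: Iterable[str]) -> sqlite3.Connection:
    """Build an in-memory stand-in for a database obtained from a peer."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE torrents (title TEXT)")
    db.executemany("INSERT INTO torrents (title) VALUES (?)",
                   [(t,) for t in titles])
    return db


class Search:
    """Callers use do_query; the backing databases stay hidden behind it."""

    def __init__(self, backends: Iterable[sqlite3.Connection]):
        self._backends = list(backends)

    def do_query(self, query: str) -> List[str]:
        seen, results = set(), []
        for db in self._backends:
            rows = db.execute(
                "SELECT title FROM torrents WHERE title LIKE ? ORDER BY title",
                (f"%{query}%",),
            )
            for (title,) in rows:
                if title not in seen:  # merge duplicates across backends
                    seen.add(title)
                    results.append(title)
        return results


service = Search([
    make_db(["Ubuntu Linux ISO", "Debian Linux ISO"]),
    make_db(["Debian Linux ISO", "Arch Linux ISO", "FreeBSD"]),
])
hits = service.do_query("linux")
```

Adding, removing or replacing a backend changes nothing for the caller, which is what makes routing the same request to another peer possible.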
This posted link could be another piece of the jigsaw (Solidity, IPFS, IPNS, ...) that I think will come together to make interesting apps in the future. Solidity doesn't mean necessarily spending $100 to make a function call - there are other chains, off chain stuff being developed, and you could host a private chain for your app.
While none of this stuff can do something new you can't do with Postgres - you can create more open and perhaps 'honest' applications where everyone can see the data-engine and understand it. So it is more of a cultural shift. For example, if you make a Twitter this way, you don't need to rely on an API. The data is there for everyone, all of it.
For example, take Uniswap. It's not a company like Facebook, it's an open protocol. Swapping tokens is now functionally open source because of that. There is no "Facebook of swapping tokens". And this can be used as a building block for other apps. Not necessarily just "gambling" or "trading" either.
Someone appears to have built at least part of it.
One of the flaws that traditional torrents suffer from is that if all the downloaders want 1 file out of 10 and thus seed only 1/10, the torrent can never again be fully downloaded once the last seeder with 100% drops off.
I suspect this is not an issue for the demo since the DB seeding is probably being monitored, but as a technique? I could definitely see chunks of a DB not being available due to no one seeding 100%
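A minimal sketch of the failure mode described above: once no peer holds 100%, the swarm can only serve the union of what the remaining peers kept. Modelling each of the 10 files as a single piece is a simplification made for illustration.

```python
def missing_pieces(num_pieces, peer_pieces):
    """Return the piece indexes that no remaining peer can serve."""
    available = set()
    for pieces in peer_pieces:
        available |= set(pieces)
    return set(range(num_pieces)) - available


# Ten files, one piece each; every remaining peer only kept file 3.
swarm = [{3}, {3}, {3}]
lost = missing_pieces(10, swarm)

# With a single full seeder still present, nothing is lost.
healthy = missing_pieces(10, swarm + [set(range(10))])
```

For a database distributed this way, any non-empty result means some queries can no longer be answered, however many peers are still online.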
Barring all that, non-default browser settings (i.e. disabling WebRTC or TURN).
Looking at Chrome's Inspector, Application Storage has "Local Storage", "Session Storage", "IndexedDB", "WebSQL" and "Cookies".
If I decide to clear all this data stored for all the pages with one button press in the settings, how can I be sure I'm excluding something important? Or if I install a new extension which helps me manage privacy, how can I be sure that it won't delete all this data because I forgot to add a site to its whitelist?
However, I do agree that relying on regular web browsers is not a good idea. A browser update could limit resources of background tabs even more than they normally do and soon your application is down.
But it literally nukes everything, including saved passwords.
https://docs.ipfs.io/concepts/how-ipfs-works/#directed-acycl...
It's like asking why we need filesystems and network cards if all we're trying to do is show lights on screens. Obviously not all patterns of lights are as easy to produce on computers.
I did a bit more research last night, and discovered that Bitfinex actually does something like this internally (anyone know if this is up to date?) [0] — they built a service discovery mesh by storing arbitrary data on a DHT implementing BEP44 (using webtorrent/bittorrent-dht [1]).
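A minimal sketch of how a record could be keyed as a BEP 44 immutable item, which is looked up by the SHA-1 hash of its bencoded value. The service-announcement record (`service`, `host`, `port`) is a hypothetical format, not Bitfinex's, and the small bencoder below is written for illustration rather than taken from bittorrent-dht.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, bytes/str, lists and dicts in bencoding."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def immutable_target(value) -> bytes:
    """20-byte DHT key for an immutable item: SHA-1 of the bencoded value."""
    return hashlib.sha1(bencode(value)).digest()


# Hypothetical announcement that a node implements the "search" service.
record = {"service": "search", "host": "10.0.0.5", "port": 8080}
target = immutable_target(record)
```

Immutable items cannot be updated in place, so a discovery mesh whose membership changes would more plausibly use BEP 44's signed mutable items; that part is omitted here.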
This seems pretty cool to me, and IMO any modern distributed system should consider running decentralized protocols to benefit from their robustness properties. Deploying a node to a decentralized protocol requires no coordination or orchestration, aside from it simply joining the network. Scaling a service is as simple as joining a node to the network and announcing its availability as an implementation of that service.
At first glance, this looks like a competitive advantage, because it decouples the operational and maintenance costs of the network from its size.
So I'm wondering if there is a consistent tradeoff in exchange for this robustness: are decentralized applications more complex to implement but simpler to operate? Is the latency of decentralized protocols (e.g. average number of hops to look up an item in a DHT) untenably higher than that of distributed protocols (e.g. one hop, once, to get instructions from a coordinator, then one hop to look up the item in a distributed KV)? Does a central coordinator eliminate some kind of principal-agent problem, resulting in e.g. a more balanced usage of the hashing keyspace?
Decentralization emerged because distributed solutions fail in untrusted environments — but this doesn't mean that decentralized solutions fail in trusted environments. So why not consider more decentralized protocols to scale internal systems?
How is this the case if earlier there is a "limitation of not being able to ask ONLY for pieces I'm interested in"?
Is there any relationship between these two projects, or are the similarities only incidental?
IPFS / Filecoin
Dat / Hypercore
PouchDB / CouchDB