Now I can do this myself, because I'm all "1337" and whatnot. (I used trickle as described in the comment[2]. Seemed to work fine. Nice and stable.) But I can't recommend with a straight face that Joe and Jane Consumer install IPFS plus some other thing, because they'll say, "Well, BitTorrent can do it!?" and I don't have a good answer.
Maybe there's an opportunity there to rent IPFS VMs to normal people? I dunno.
https://github.com/ipfs/go-ipfs/issues/3065
https://github.com/ipfs/go-ipfs/issues/3065#issuecomment-415...
So, while we get there, applications and tools can also manage their own bandwidth, so that the default settings are gentler on users' bandwidth.
[1] https://github.com/ipfs/community/blob/master/code-of-conduc...
We do not and cannot control the data that each individual node is hosting in the IPFS Network. Each node can choose what they want to host - no central party can filter or blacklist content globally for the entire network. The ipfs.io Gateway is just one of many portals used to view content stored by third parties on the Internet.
That aside, we definitely don't want to 'apply copyright worldwide'. For one, it's not consistent! Trying to create a central rule for all nodes about what is and isn't "allowed" for the system as a whole doesn't work and is part of the problem with more centralized systems like Facebook, Google, Twitter, etc. Instead, give each node the power to decide what it does/doesn't want to host, and easy tooling to abide by local requirements/restrictions if they so choose.
The IPFS network itself is decentralized, there's no central authority that can police copyrighted content on individual nodes.
IPFS is not trying to be an anonymous file-sharing service afaik.
On the permissions thing, we can all agree that if it's a new movie that just came out, OK, yes, don't be spreading that.
What if instead it's an academic journal article from 1930 in a publication that ceased operating in say 1940? You also don't have permission for this and it's still also under copyright.
The strict interpretation would be "not that one either", while there are also some who say "it's ok, there's nobody to even ask; let historians do their research".
So some prefer to be grey about it, like many are with obscenity and pornography. We don't, for example, hide renaissance paintings with exposed breasts away from the public in basements for fear of getting shut down by the police.
There's a spirit of the law as well.
Today's IPFS is a long way from that - any Joe Random can DoS any particular hash by getting their node at the right place in the DHT and blackholing requests.
Can you explain more how this is possible?
So we have one evil user Karen who wants to block access to content ABC.
She will spam the DHT with requests for content ABC. After a while, nodes will stop responding to her as she hits the rate limit. Now her DHT requests go into the void.
Now Joey wants to request content ABC too. He requests the content, and because the rate limit applies only to Karen, the nodes still respond to Joey's request for the content. So he can fetch the content just fine.
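The per-peer rate limiting being described can be sketched in a few lines of Go. This is a toy model, not go-ipfs's actual implementation; the budget number is made up:

```go
package main

import "fmt"

// Toy per-peer rate limiter: each peer gets a fixed budget of
// DHT requests; once it is exhausted, that peer's requests are
// dropped. Real implementations use time windows or token
// buckets; the fixed budget here is just for illustration.
type RateLimiter struct {
	budget    int
	remaining map[string]int
}

func NewRateLimiter(budget int) *RateLimiter {
	return &RateLimiter{budget: budget, remaining: map[string]int{}}
}

// Allow reports whether a request from this peer should be served.
func (r *RateLimiter) Allow(peer string) bool {
	if _, ok := r.remaining[peer]; !ok {
		r.remaining[peer] = r.budget
	}
	if r.remaining[peer] == 0 {
		return false // over budget: request goes into the void
	}
	r.remaining[peer]--
	return true
}

func main() {
	rl := NewRateLimiter(3)
	// Karen spams 10 requests; only the first 3 are served.
	served := 0
	for i := 0; i < 10; i++ {
		if rl.Allow("karen") {
			served++
		}
	}
	fmt.Println("karen served:", served) // 3
	// Joey's request is still served: the limit is per-peer.
	fmt.Println("joey served:", rl.Allow("joey")) // true
}
```

Because the budget is tracked per peer, Karen exhausting hers has no effect on Joey.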
There isn't one.
> any Joe Random can DoS any particular hash
Sounds pretty decentralized to me!
Key problems in this space:
- decentralization is something techies obsess over, but as of yet it has no business value whatsoever. Customers don't ask for it. Centralized alternatives are generally available and far more mature and easier to manage. The business ecosystem around IPFS is basically not there.
- this space is dominated by hobbyists running stuff like this on their personal hardware, mostly for idealistic or other non-monetary reasons. Nothing wrong with that, but using it for something real brings requirements with it that are currently hard to address.
- Filecoin has been 'coined' as the solution for this for years but seems nowhere near delivering on its published roadmap. Last time I checked, it had milestones in the past that were still undelivered. At this point it looks increasingly like something that is a bit dead in the water, and a big distraction from coming up with better or alternative solutions.
- Uptime guarantees for content are currently basically DIY. Nobody but you cares about your data. If you want your content to stay there, you basically need ... proper file hosting. As incentives and mechanisms for others to agree to host your content (aka pinning, in IPFS terms) are not there, this is hard to fix.
- integration with existing centralized solutions is kind of meh/barely supported. We actually looked at using S3 as a store for IPFS just so we could onboard customers and give them a decent SLA (bandwidth would be another issue there). There are some fringe projects on GitHub implementing this, but when we looked at it, the smallish block sizes in IPFS were kind of a non-starter for using this at scale (think insane numbers of requests to S3). This stuff is not exactly mainstream. Obviously this wouldn't be needed if we could incentivize others to 'pin' content. But we can't currently.
I don't really understand that point. If it works, then it has the business value of saving you all running server and maintenance costs. For most larger businesses these costs may be insignificant and easily recovered, but for small businesses with a lot of customers they can make a huge difference. For example, I'm looking into P2P options for implementing a decentralized message forum in a game-like emulator and it would make no sense to even implement this feature with constant running costs for server space.
Now, getting the decentralized data management to work reliably out of the box from behind various firewalls and on different platforms, that's the big problem. So far, none of the libraries I've seen are very easy to use; some require a difficult installation and configuration, or you need to run your own STUN server or gateway, which kind of defeats the purpose.
You absolutely can incentivize others to pin content. Check out Infura (infura.io), Pinata (pinata.cloud), Temporal (temporal.cloud), and Eternum (eternum.io) - these are all services you can pay to host your IPFS data with reasonable uptime. They have an incentive to keep your content around because you pay them to. Filecoin is a distributed solution to that (and making active progress - they have a testnet out with over 5PB of proven storage: https://filecoin.io/blog/roadmap-update-april-2020/), but you don't have to wait for that to land.
Resilio (originally part of BitTorrent) specializes in decentralized storage synchronization for business.
Though we can't see if they are profitable, their business use cases seem compelling, especially if you're reluctant to otherwise use the cloud.
https://www.crunchbase.com/organization/resilio#section-over...
IPNS performance was abysmal last time I tried to use ipfs (took > 30 seconds a fair amount of the time). Curious to see what it's like now.
Since node IDs are random, node lookup may require multiple interplanetary hops, no?
For instance, from N1 on Earth to the nearest XORwise node in its routing table which might be N2 on Mars, whose routing table finds the target node N3 on Earth.
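Right - Kademlia-style routing compares XOR distances between IDs, not geographic ones, so the "nearest" hop can be physically far away. A toy sketch in Go, with node IDs shortened to one byte for readability:

```go
package main

import (
	"bytes"
	"fmt"
)

// XorDistance returns the Kademlia-style distance between two IDs:
// the bitwise XOR of the IDs, compared as a big-endian integer.
func XorDistance(a, b []byte) []byte {
	d := make([]byte, len(a))
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// Closer reports whether id1 is XOR-closer to target than id2.
func Closer(id1, id2, target []byte) bool {
	return bytes.Compare(XorDistance(id1, target), XorDistance(id2, target)) < 0
}

func main() {
	target := []byte{0b1010_0000}
	n2 := []byte{0b1011_0000} // the "Mars" node: XOR distance 0b0001_0000
	n3 := []byte{0b0010_0000} // an "Earth" node: XOR distance 0b1000_0000
	// Routing follows XOR distance, not geography: n2 is "closer".
	fmt.Println(Closer(n2, n3, target)) // true
}
```

So yes: a lookup happily routes via Mars if the Mars node's ID happens to share a longer prefix with the target.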
https://docs-beta.ipfs.io/recent-releases/go-ipfs-0-5/featur...
My impression is that DHTs fall down pretty hard under the latter scenario and are also pretty vulnerable to the Sybil scenario if the attacker has enough resources to mount a really serious attack. They're okay for low-value simple stuff that doesn't have much of an intrinsic bounty attached to it (like BitTorrent magnets), but trying to put a "decentralized web" on top of a DHT seems like a scenario where the instant it becomes popular it will get completely shredded for profit (spam, stealing Bitcoin, etc.).
My rule of thumb is that anything designed for serious or large scale use (in other words that might get popular) needs to be built to withstand either a "nation state level attacker" threat model or a "Zerg rush of hundreds of thousands of profit motivated black hats" threat model. The Internet today is a war zone because today you can make money and gain power (e.g. by influencing elections) by messing with it.
For Sybils: You've left the attack you're worried about pretty vague. IPFS itself doesn't need to tackle many of the sybil-related issues by being content addressable (so only worrying about availability - not integrity) and not being a discovery platform - so not worrying about spam / influence. For the remaining degradation attacks - someone overwhelming the DHT with misbehaving nodes - there's been a bunch of work in this release looking at how to score peers and figure out which ones aren't worth keeping in the DHT.
This becomes hugely problematic the minute you start using IPNS. On one side of the split, the name `foo` can resolve to `bar`, but on the other side, it resolves to `baz`. If you're trying to impersonate someone, then a netsplit would make it easy for you to do so (barring an out-of-band content authentication protocol, of course).
> You've left the attack you're worried about pretty vague. IPFS itself doesn't need to tackle many of the sybil-related issues by being content addressable (so only worrying about availability - not integrity) and not being a discovery platform - so not worrying about spam / influence.
On the contrary, a Sybil node operator can censor arbitrary content in a DHT by inserting their own nodes into the network that are all "closer" to the targeted key range in the key space than the honest nodes. This can be done by crawling the DHT, identifying the honest nodes that route to the targeted key range, and generating node IDs that correspond to key ranges closer than them. Honest nodes will (correctly) proceed to direct lookup requests to the attacker nodes, thereby leading to content censorship.
Honest nodes can employ countermeasures to probe the network in order to try and see if/when this is happening, but an attacker node can be adapted to behave like an honest node when another honest node is talking to it.
> For the remaining degradation attacks - someone overwhelming the DHT with misbehaving nodes - there's been a bunch of work in this release looking at how to score peers and figure out which ones aren't worth keeping in the DHT.
Sure, and while this is a good thing, it's ultimately an arms race between the IPFS developers and network attackers who can fool the automated countermeasures. I'm not confident in its long-term ability to fend off attacks on the routing system.
I don't think I see the impersonation / bad resolution problem though. IPNS records are content addressed to the key. Having control of a portion of the network isn't sufficient to compromise that (you can prevent availability though).
Check out our most recent research discussion for some thoughts about how we might scale up Sybil resistance in p2p networks: https://www.youtube.com/watch?v=L4SJzoKHKPk
IPFS has been a real pain to work with in the past (the node would just consume all RAM and CPU and had to be restarted a lot), but it's been getting better, which is great to see.
I really hope it gets good enough to run on everyone's desktop machine, since that's the way IPFS is meant to be deployed (rather than just on gateways). It seems that it doesn't take up too much RAM or CPU now, but it looks like it might be a problem bandwidth-wise, if you host some popular content.
Still, great news overall.
The improvements announced are substantial.
Edit: making jokes about plagiarism, people.
Edit: The way I see filecoin working is anyone can post a reward for a file and once the file is provided, the reward is paid. In other words, it's a bit like a brokerage that connects downloaders with uploaders. The difficulty is that this brokerage needs to be distributed and resilient.
If that is true, it means that a node can only be profitable if you are freeloading. And if you are freeloading, any price you get will be good, which means that prices tend to go even further down, perhaps even below the commodity cost. I may be missing something, but I really don't see this going beyond techies with spare disks playing around, and I definitely see no way to run a Filecoin node on a VPS profitably.
Right now if I wanted to compete with Amazon, I could buy 2 or 3 petabytes of storage space and some bandwidth. But nobody would trust me not to lose their data. This is the differentiation that makes cloud storage less of a commodity than it could be. The goal with filecoin is to make it so you don't have to trust me, just some general guarantees about the filecoin network. If amazon for some reason was selling storage on this network, I could compete with them on equal footing (and whoever sold it more cheaply would win).
Many commodities are in close-to-perfect competition. For instance, I don't care if grain comes from England or France, just that it's cheap. While it is very hard to make an economic profit selling grain, many people do farm grain and make enough money to support themselves (The money you pay your workers to survive is one of the costs of producing grain. That's true even if the only worker is you)
If you could make a profit by running a filecoin node on a VPS, everyone would do that. By competing with each other you'd all bring the price down and down until it was no longer more profitable than selling any other commodity (and maybe even lower). So you're right to not expect to be able to do that.
> which is the ideal state for a commodity, there is no economic profit.
No. A commodity is simply any kind of good that is considered equivalent no matter who made it. Oil is a commodity, there is plenty of competition and there is still plenty of profit to be made (current crisis excluded)
> While it is very hard to make an economic profit selling grain, many people do farm grain and make enough money.
The analogy here would be to get people to (a) "farm" disks (and electricity/internet) and (b) make some profit to justify their work. (a) is impossible (except for freeloaders), and you agree with me that (b) seems unlikely.
So the question is: who would be interested in taking the supply side of the filecoin network, when there is no profit to be made?
You're not going to be running a node on your spare capacity that brings in more than your costs; in fact, unless you're a datacentre operator, you're going to be making a loss.
In the end it favours big centralised services, and IPFS (if it ever takes off) just becomes a rival API to S3.
1) Accounting profit: revenue - expenses. This is colloquially what most people mean when they say "profit". To break even you just need to make enough money to cover your explicit expenses. In your example, you would need to make more in Filecoin income than you spent in disks/electricity/etc.
2) "Normal profit": this is a technical term in microeconomics. Basically this takes into account opportunity cost; the cost of not doing some other profitable thing with your time/money. By these standards, to break even you'd not only have to cover the cost of disks/electricity but also the "opportunity costs" of whatever else you could have been doing with those disks/electricity (like mining Bitcoin), or whatever you could have been doing if you never even purchased those disks to begin with (like investing in the S&P or whatever).
3) "Excess" or "economic" profits. These are extra profits on top of whatever you earn in #2. This is the part that goes to 0 for commodities like milk/gasoline/Filecoin but is non-zero for differentiated products like iphones.
For Filecoin specifically, I would expect that normal profits would probably be above the raw costs of disks and electricity, since it takes some effort to purchase and administer all that hardware, but probably below the cost of renting a VPS, since a VPS is typically used for more profitable things than just sitting around and holding stuff on disk.
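With made-up numbers, the three senses of "profit" above line up like this:

```go
package main

import "fmt"

func main() {
	// All figures are hypothetical, purely for illustration.
	revenue := 1000.0    // Filecoin income over some period
	expenses := 700.0    // disks, electricity, bandwidth
	opportunity := 200.0 // what the same time/money could earn elsewhere

	// Sense (1): accounting profit = revenue minus explicit expenses.
	accountingProfit := revenue - expenses

	// Sense (2): you earn a "normal profit" if accounting profit
	// just covers the opportunity cost. Sense (3): anything beyond
	// that is economic ("excess") profit, which competition pushes
	// toward zero for a commodity.
	economicProfit := accountingProfit - opportunity

	fmt.Println("accounting profit:", accountingProfit) // 300
	fmt.Println("economic profit:", economicProfit)     // 100
}
```

In a mature commodity market the last number drifts to zero, while the first can stay comfortably positive.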
> I would expect that normal profits would probably be above the raw costs of disks and electricity, since it takes some effort to purchase and administer all that hardware
No, it doesn't. Most of the people selling the idea of distributed file storage (Filecoin, Storj, Sia) are thinking about taking the excess capacity of everyone's disks and turning it into a marketplace. Getting the hardware and administering it are sunk costs. Participants are only supposed to download a client, indicate the payout method, and ensure that they don't turn off their computers. It's a one-off effort. Even understanding (or pretending to understand) all of the crypto-babble in order to get paid is something that people are supposed to do only once.
This is the part where these different projects are working out their market dynamics and where I get lost about Filecoin:
- With Storj (and Sia, I believe) the price of the data is determined by the governors of the network. Storj's last rollout was paying $5/TB-month to every client. There is no bidding war. As long as you could prove you held the data for that period, you would get the money. This makes sense to me: no matter if I am some dude in a basement with a disk on my laptop, or a datacenter that gets to charge more than $5/TB-month from my usual customers, my excess capacity can be sold for $5/TB-month, and this becomes a simple matter of waiting for the demand to grow.
- With Filecoin the idea is to have a completely open market, where each node says how much it is willing to charge for the data and what the constraints are (availability, time to retrieval, etc). This leads to possible (inevitable?) bidding wars. No matter the demand curve, there is always someone who will be willing to put in the lowest price that gets them "something". For excess capacity, this "something" can be effectively zero, or whatever is the lowest price possible in the bid. And if it can be zero, the only thing I see helping them would be the nodes starting to cooperate with each other and act like a cartel.
To be honest, even Storj and Sia are hard sells for me. I wouldn't bet on these projects for the long run and I don't believe that they will actually threaten the current players, but at least I understand the incentive structure. With Filecoin, it can only be profitable when the demand for data storage grows faster than the storage capacity of the network, but not so profitable as to attract people from the demand side to the supply side. It's crazy.
If you are using this seemingly free connection capacity, internet access rates should go up, because there is no such thing as a free lunch?
Only if these enthusiasts are willing to take a loss on power.
> If the price hits the price of cloud hosts, then people will just run tons of Filecoin nodes on cloud hosts
Or the cloud hosts themselves will enter the market. The upper bound on the market will be their costs, which will always be lower than the costs of enthusiasts using spare disks at home.
Ok. Agreed.
> If all that capacity gets used up, then supply and demand will cause the price to go up until more people add capacity.
Agreed.
> If the price hits the price of cloud hosts, then people will just run tons of Filecoin nodes on cloud hosts.
Not necessarily. I can just buy more disks and run locally, whether for hosting my data or to sell the space on the network, which brings the price back to people-selling-spare-disk-space-for-cheap level.
> Then anyone that can undercut the cloud hosts will be free to run their own Filecoin nodes, and the price will be pushed down as people do this.
Yeah, but that is kind of my point. Everyone will undercut each other until the price hits the bottom. The equilibrium will be at most the commodity price.
How does that work in real life?
Of course, in the shorter term there are many different scenarios where the market is slower to adjust. The STORJ network has been subsidizing their node operators with VC money for example, leading to some excellent unit economics for the early adopters.
> The primary focus of this release was on improving content routing. That is, advertising and finding content. To that end, this release heavily focuses on improving the DHT.
> A secondary focus in this release was improving content transfer, our data exchange protocols
Other improvements to data store, libp2p, etc.
source: https://github.com/ipfs/go-ipfs/blob/master/CHANGELOG.md
Wow. Thank you IPFS community.
- Pin management is far too simple for any real use case.
- GC is a complete collection, triggered at an arbitrary threshold.
- The API is based on the CLI and is really weird (e.g. GET /api/v0/cp?arg={src}&arg={dest}: why don't the parameters have names? Why is it GET?)
- They don't provide a way to upload a directory with references to existing IPFS objects, so you need to manually encode the data.
- The manual encoding uses Go-style names; they seem to have let their language choice leak into the protocol.
- Lots of minor issues, such as the API implementing block size limits by simply cutting off the input data at the size limit. This happens before encoding into the format where the size limit should actually be calculated, and can cause data corruption in some cases.
- Their one abstract API for creating directories (MFS) has a number of issues:
- It is incredibly slow for some reason.
- It sort of, but not quite, automatically pins all recursively referenced content.
There are other weird choices, such as their directory sharding using hashing. It isn't clear to me which use case this improves over a B-tree, but someone probably thought it sounded cooler. Additionally, the sharding appears to be a single layer, which means that it still has a size limit (just a larger one). I ended up sinking a ton of time into https://gitlab.com/kevincox/archlinux-ipfs-mirror/ and in the end the number of small quirks was incredibly frustrating. I might go back and implement their hash-sharding logic to get the mirror working again, but at this point I don't really want to interact with go-ipfs again.
I thought it would be an interesting project to get involved in, and I have a lot of expertise that would be valuable; however, it appears that it is all a bit of a mess, which is a real shame.
Maybe with more success it can be rewritten in a cleaner fashion; as I said, most of the issues are with the implementation, not the protocol, so there is definitely hope. I honestly do wish the project all the success.
Things like distribution of apt packages become much more exciting when each computer can choose to redistribute the packages it got to others on the LAN or in the area, even offline.
It's also very interesting to me how all the visitors to your website become servers as well, so content can never be "hugged to death" or links can never go stale, as long as at least one person has the content somewhere on their node.
That, to me, is huge, as links now go stale with some regularity. Think of all the Geocities sites (and all versions of them) just existing forever, regardless of whether Geocities decided to shut down.
For example, here's my site on IPFS:
https://ipfs.eternum.io/ipfs/QmVW6JejQkjLnBJacR8qcZi88WNTMwi...
That can now "never" be lost, as long as someone cares enough about it to visit.
What new thing does ipfs make possible that would make users install the software to use it?
You visit a site, get the index and it references two more files with two URLs:
"./jquery-3.5.0.min.js" "https://cdn.com/4kVideo.mkv"
Your browser gets those files, the first from the site, the other from a CDN. Your site scales according to the bandwidth that "cdn.com" is able to provide to you.
In IPFS, you visit a site, get the index, and it references two more files with two URLs:
"ipfs://QmWYudWcbX6skKub5wg1Ga3LFh3vbW2k7PWfdqHtDYvAdp" (for fun I used the right address here) "ipfs://QmT9qk3CRYbFDWpDFYeAv8T8H1gnongwKhh5J68NLkLir6"
Your browser gets those files. The first is found in about a billion places: a lot of sites use jQuery, so almost everyone has the file's content available, and because you'd already visited another site with that content hash, IPFS knows it can just use that one (a perfectly safe cross-domain cache hit). The second is found in fewer places, but it follows the same logic as a torrent, with data coming from both an IPFS CDN the site uses and a few people who've also seen the video, so it loads faster from a few sources than from one.
Hope this makes sense. IPFS is really just a decent way to implement websites like a torrent and enjoy the benefits that brings. It's not grand or out of this world, just decent space-saving data management.
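The cross-domain cache hit falls straight out of content addressing. A toy sketch in Go, where bare SHA-256 stands in for IPFS's multihash-based CIDs and the cached bytes are a made-up placeholder:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// key derives a content address from the bytes themselves.
// Real IPFS uses multihash-based CIDs rather than bare SHA-256,
// but the property that matters here is the same: identical
// bytes always map to an identical key, no matter who serves them.
func key(data []byte) string {
	h := sha256.Sum256(data)
	return hex.EncodeToString(h[:])
}

func main() {
	cache := map[string][]byte{}
	jquery := []byte("/*! jQuery v3.5.0 | (c) JS Foundation */")

	// Site A references jQuery by content hash; we fetch and cache it.
	cache[key(jquery)] = jquery

	// Site B references the same bytes. Same hash, so this is a
	// safe cross-domain cache hit with no second download.
	_, hit := cache[key(jquery)]
	fmt.Println("cross-domain cache hit:", hit) // true
}
```

With URL-based caching this optimization is unsafe (two sites could serve different bytes at the same path); with content addressing the key *is* the content, so the hit can never be wrong.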
If you disagree, please explain why.
Basically, I host your file and you host my file, with some algorithm to make sure the number of seeders for each block never goes to 0.
Rust needs to start edging out Go for new tools. Kubernetes, Envoy, IPFS, etc. would have benefitted from it.
It might not have been time four years ago, but it's time now.
Rust is a great language, but right now I can start a project in Go and have someone else start a project in Go and have something up and working in a week. Rust still takes significantly longer to learn and/or find skilled developers for. Go is up there with Python in terms of "Get something out now".
I guess that time will never come...unless Go somehow loses its way
Before downvoting, this is our experience.
Even in a globalized developer market a smaller developer force is a risk.
In German we have a saying "to die in beauty" (in Schönheit sterben), does that translate? Your boss wants to get things done in due time and I see Go outcompete Rust in this respect.
If we revisit that now, C++ is the wrong language to start Envoy in today.
This is useful for p2p systems in which you are connected to thousands of peers multiplexing several p2p protocols with each concurrently. You want to be able to switch threads as fast as possible.
The concurrency model is a bit easier to reason about too.
It would be good to see how Rust and others do. Other languages excel at other things (i.e. embedded devices ).
Honestly, I'm not sure why Go was picked specifically, but if you put things in the context of 5 years ago it doesn't seem a crazy choice.
They're very sensible about it, and advocate using the right tool for the job.
https://blog.discord.com/why-discord-is-switching-from-go-to...
A big reason why C is still widely used. A library that's written in C can be used in any other language.
go-ipfs is where the majority of new development happens right now as the desktop/server implementation (compared to js-ipfs=browser & rust-ipfs=IoT) - but if you want to prove that Rust is clearly better/faster _in general_ - have at it: https://github.com/ipfs-rust/rust-ipfs ;)
I think the thing jbenet was selecting for back in 2013 was concurrency support & modularity, and golang is still a decent choice for that. Rust 1.0 didn't happen until 2015 after the go-ipfs alpha was already out - but agree it's made awesome progress since then!
But fair point :)
And do IPNS entries finally last more than 30 seconds? It would be nice to not have to constantly keep a node up just to have an IPNS entry.
Here's what's new for IPNS in the release: https://github.com/ipfs/go-ipfs/blob/master/CHANGELOG.md#ipn...
Like this https://blog.ipfs.io/2020-02-14-improved-bitswap-for-contain...
Note that this is also similar to Kraken[1] from Uber, and Dragonfly[2] from Alibaba. Facebook also does container and artifact distribution using BitTorrent, but I can't find a good reference to it.
the py-ipfs-http-client library is also actively maintained (but I think needs some small changes to work with IPFS 0.5): https://github.com/ipfs-shipyard/py-ipfs-http-client/
The path to upgrading the dependencies is probably to run `go get -u <dep>` for the direct dependencies you're including, and then fix the errors that pop up from doing so.
You can usually take go-ipfs's go.mod as a guide on what versions to use.
Yes, this is the problem I initially had; I would end up with a host of incompatible deps (leading me to wonder what the point of go.mod even was ... but I digress...).
Sounds like this is going to be one of those super fun dependency hack-jobs :C
Ah well, this is the price to pay for playing with beta software!
I already ported the Go code to iOS once, and it wasn't that painful, but functionally it would be a lot more useful as a C++ code base. I don't build browser apps - I feel there is still a need for native apps especially in this particular space.