And if you have another suitable system, you can also plug it in. E.g. you might want to use another DHT that allows mapping from a key to some address data.
(Or, in so many words: an alternative for dynamic DNS without a centralized/hierarchical lookup infrastructure that punches through NATs without all the associated hassle).
I.e., the problem is "communicate directly with a node on the Internet by its unique ID".
The big question is: what do you solve that Kademlia (BitTorrent) doesn't?
Problem history goes like this:
* MAC addresses were made to both identify and address nodes on the Ethernet. They're unique and tied to nodes on a hardware level.
But they didn't facilitate routing between several Ethernet networks.
Linking several Ethernet networks into one big net through an arbitrary topology of hierarchies of routers, with 1980s commodity hardware, was a challenge.
Without any structure in the MAC address itself, the address alone said nothing about where a packet should go.
It is, in fact, not so much an address as an identifier.
Side note: after writing this sentence, I double checked Wikipedia to make sure I'm not forgetting anything, and lo and behold, IEEE agrees with what I wrote! They're officially called EUIs now (extended unique identifiers), not addresses.
Analogy: MAC is like a person's SSN. It doesn't tell anyone about where that person is.
You can use it to give mail to the right person when you know the SSNs of everyone in the room.
* IP addresses were made to address nodes on the Internet in a simple way.
They are actual addresses, semantically, with different parts telling which sub-network to route to.
It worked OK when the net was small and relatively static.
As the net grew, the fact that IP addresses were not made to identify nodes became a problem.
As in: nothing in IP tells you what is attached to that address. And if you are on the net, your IP address may (and often will) change after a reboot.
As an analogy: IP is the street address.
It's good enough to mail specific people only when everyone lives alone in their own house and doesn't travel.
When they do, they have to give you the address of their hotel. If they don't do that, you can't reach them.
* DNS was made, in part, to solve that problem (and allow human-readable addresses). But it introduced quite a few others.
The list is pretty long, but a few are:
* Reliance on the bureaucracy of registrars / centralization (and having to pay a fee for a domain name)
* Complex setup
* High propagation latency (hours to days)
DNS was made to facilitate communication for the client reaching out to a server; centralization is inherent in design choices, as are some assumptions.
Like the assumption that the server isn't changing IP addresses too much, and that the people running the server have some control over that.
DNS propagation time being a quarter hour to several days long isn't a huge problem with that assumption. You paid for a static IP block anyway to run your site, right?
DNS was a step back from the decentralized nature of the Internet, heavily discouraging hosting on your own machines.
As an analogy: to make things simpler, you can now send mail to "Pepsico, Inc." without specifying an address at all, because the postal service maintains an address book where anyone can get listed, for a fee.
You still have no way to reach your friend after they moved.
* Dynamic DNS services only partially addressed this problem, being a bolt-on solution that puts you at the mercy of a dynamic DNS service. Which may or may not be free, and is outside your control.
(Self hosting your own dynamic DNS infrastructure is not fun).
Analogy: your friend goes out of their way to put "YourFriend, Inc" in the postal service's address book, and makes sure to keep their address up to date.
* IPv4 addresses eventually introduced another problem which DNS alone doesn't solve.
There are too few of them.
Hence, NAT.
That's to say, IPv4 failed at the one thing it was still doing: addressing.
It only became a partial address. In practice, (IP + port number) would be a working address, so with Port Forwarding you could host things on your network-attached computing device.
Analogy: the addresses are missing names of the people.
As apartment complexes replace single-person homes, the best you can do is specify apartment number along with the address.
The postal service ignores it, but the apartment complex management will (hopefully) put your letter into the right mailbox.
* This, of course, breaks Dynamic DNS as a solution if the node moves between networks.
You're generally not in control of port forwarding. And the port number is not a part of the IP address, so it isn't in DNS.³
Analogy: your friend is again unreachable, because they can't include their room number in the address book.
They stay in room #80, but it's reserved for the management in most hotels.
* IPv6 solved the problem of "not enough IP addresses", but not really.
IPv4 and NAT are still there; IPv6 adoption stalled at less than 50% worldwide².
Habits die hard. NAT is the poor man's firewall (and some folks love NAT so much, they made NAT for IPv6¹).
Analogy: USPS rolls out a new address format, where each piece of furniture in each room in every apartment of every building is addressable.
Your friend can get their address in that format from their hotel's management when they travel within the US. Usually.
In China, they don't do that.
* VPNs "solve" the problem by having everyone connect to a central node, at which point it's just like Ethernet.
Aside from scale limitations, it's no different from any other client-server architecture; nodes need to communicate via a common third node on the Net.
Analogy: the postal service has "return to sender" envelopes that don't require you to fill out the address at all.
How it works (and why you can "return to sender", but not mail them directly) is beyond you⁴.
You don't know, and you don't care.
To communicate, you and your friends simply address all mail to Joe, your mutual friend.
On the letter head, you specify the addressee by name.
Joe sorts it all out, and puts all mail addressed to you into the "return to sender" envelope.
* NAT hole punching uses an intermediary that both nodes reach out to in order to exchange the "return to sender" information, then uses that information while it lasts.
Analogy: instead of having Joe forward mail from friends, you all simply write Joe each day, and he sends copies of the other person's "return to sender" envelope in response.
Now you have a "return to sender" envelope which goes to your friend (and vice versa), so you can write to each other directly.
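To make the "return to sender" mechanics concrete, here is a toy sketch in Python. It is not real networking code: `Nat` and `Rendezvous` are hypothetical classes made up for illustration, and the NAT is assumed to be the friendly kind that accepts inbound packets from anyone once a mapping exists (many real NATs are stricter).

```python
import itertools


class Nat:
    """Toy NAT: inbound packets are delivered only if an outbound packet
    created a mapping first. Assumes endpoint-independent mapping."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self._ports = itertools.count(40000)  # arbitrary starting port
        self.mappings = {}      # public port -> (private host, private port)
        self._by_private = {}   # (private host, private port) -> public port

    def outbound(self, private_host, private_port):
        """A node behind the NAT sends a packet out.
        Returns the public (ip, port) the outside world sees."""
        key = (private_host, private_port)
        if key not in self._by_private:
            port = next(self._ports)
            self._by_private[key] = port
            self.mappings[port] = key
        return (self.public_ip, self._by_private[key])

    def inbound(self, public_port):
        """Deliver an incoming packet, or drop it (None) if no mapping exists."""
        return self.mappings.get(public_port)


class Rendezvous:
    """Hypothetical intermediary ("Joe"): remembers the public address
    each peer was seen from and hands it to whoever asks."""

    def __init__(self):
        self.seen = {}

    def register(self, peer_id, public_addr):
        self.seen[peer_id] = public_addr

    def lookup(self, peer_id):
        return self.seen.get(peer_id)
```

Each peer calls `outbound(...)` towards Joe, Joe stores what he saw via `register(...)`, and the other peer gets that address from `lookup(...)`. Before the outbound packet, `inbound(...)` on the same NAT just returns `None`, which is the unreachable friend from the analogy.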
* Peer-to-peer networks (Kademlia, Gnutella, etc) that emerged in the early 2000s have worked out an entirely different (to DNS) approach to identifying and addressing nodes, generally termed DHT (distributed hash table).
Instead of using a centralized/hierarchical/federated lookup table to do
(node name, DNS server address) → node address
Kademlia introduced a much more sophisticated approach:
peer address → peer ID
(query ID, peer address) → list of [next peer address]
where d(query ID, next peer ID) < ½d(query ID, peer ID) in the XOR metric.
This enabled O(log n) lookup convergence.
This solves many problems, but in particular, facilitated a distributed key-value store that doesn't rely on hierarchy/federation.
The node ID in a Kademlia network stochastically encodes routing information.
It's a key-value store where the node ID tells you something about the keys the node can provide value for.
Where IP has a rigid structure and reliance on subnet mask hierarchy (the first X bits say something), each Kademlia node is a router which stores information in a flexible way (X bits of the address may say something, but not any specific ones).
In short, Kademlia already solved the "MAC address on the Internet" problem in a decentralized way.
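A rough sketch of the XOR-metric lookup, in Python, for anyone who wants to poke at it. This is a simplification under my own assumptions, not the Kademlia spec: IDs are 16 bits instead of 160, every node is assumed to know the whole network when building its table, and `build_routing_table` and `lookup` are names I made up.

```python
ID_BITS = 16  # real Kademlia uses 160-bit IDs; kept small for readability


def xor_distance(a: int, b: int) -> int:
    """Kademlia's metric: the distance between two IDs is their bitwise XOR."""
    return a ^ b


def build_routing_table(me, all_ids, k=3):
    """Keep up to k peers per bucket, where bucket i holds peers whose
    distance from `me` has its highest set bit at position i."""
    buckets = {}
    for other in all_ids:
        if other == me:
            continue
        i = xor_distance(me, other).bit_length() - 1
        bucket = buckets.setdefault(i, [])
        if len(bucket) < k:
            bucket.append(other)
    return [peer for bucket in buckets.values() for peer in bucket]


def lookup(start, target, tables):
    """Hop to the known peer closest to `target` until no known peer is
    closer. Returns (node reached, number of hops)."""
    current, hops = start, 0
    while True:
        known = tables[current]
        best = min(known, key=lambda n: xor_distance(n, target), default=current)
        if xor_distance(best, target) >= xor_distance(current, target):
            return current, hops
        current, hops = best, hops + 1
```

Every hop clears at least the top differing bit between the current node and the target, so the number of hops is bounded by the ID length even though each node only stores a handful of peers per bucket.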
* This alone may still leave the problem of NAT hole punching for legacy networks.
Reminder, the entire problem amounts to having the nodes reach some other node (for the NAT router to open a port, i.e. create a valid "return to sender" address), and for that node to store/propagate that return address to other nodes.
But any node in a decentralized peer-to-peer system can do that.
NATs weren't obstacles to Kademlia more than a decade ago (see: libcage⁵).
* Iroh offers Kademlia⁶ as an option to retrieve the
(Node ID → address)
mapping, similar to DNS, and then offers a relay system on top of that for NAT hole punching.
QUESTION:
What problem does Iroh solve that Kademlia (in particular, libcage implementation) doesn't?
My current understanding is that Iroh is just Kademlia with extra steps.
Help me out here :)
______
¹https://www.infoblox.com/blog/ipv6-coe/you-thought-there-was...
²https://dnsmadeeasy.com/resources/the-state-of-ipv6-adoption...
³https://serverfault.com/questions/940476/my-dns-record-can-o...
⁴Turns out, it's simple: hotel management puts a green sticker on the return address for the mail you send out, so when they get responses with a green sticker, they give them to you.
They remove the sticker, so you never know it was there, and they pick colors at random each time — whatever is left in the pile.
⁵https://github.com/ytakano/libcage
⁶pkarr uses mainlineDHT, which is a flavor of Kademlia (also used in BitTorrent, among others).
Took a few hours while I was trying to understand what exactly iroh is doing.
It was good to refresh some things in memory along the way, and learn some too :)
With the relay daemon being self-hostable and OSS, any use case that needs to be more censorship-resistant than that has the option to run its own relays as needed.