A few months ago I published https://cpu.land (discussion: https://news.ycombinator.com/item?id=37062422). After cpu.land, I felt a lot of pressure to make another Big Giant Thing but didn't really have anything compelling. So I just hacked away on personal projects and, through some coincidental learning on how the Internet works, ended up hacking together a traceroute program that could live stream to a website from scratch!
I realized I had never seen this sort of thing on the web before, and it was actually a kind of cool and novel way of visualizing the structure of the Internet, so I polished it up and built a pretty site around it. In the process, I learned some really interesting things about BGP and the structure of the Internet, so I melded the traceroute tool with an article sharing that knowledge.
I'm still hacking on this and I'm sure my code will manage to break somehow, so please let me know if you have any suggestions! :)
(Side note: why Rust? I don’t think programming language choice matters that much, but I wanted to quickly write a very dependable low-level program, and I really like Rust’s error handling primitives. Why do you care about this?)
Search for: "looking glass bgp" and you'll find some[1]. One of the first CGI programs I wrote nearly three decades ago (ugh...) was a Perl script that wrapped traceroute and streamed the results via server push[2]. Everything old is new again. :-)
That said, your site has a very nice presentation.
BTW, IPv4 TTL is de jure seconds even though it's de facto a hop count, since no router takes more than a second and the minimum decrement is 1 (except that middleboxes which wish to remain hidden won't decrement at all). Also, Linux/Unix traceroute by default uses UDP to a high-numbered (and usually closed) port for probe packets, since UDP historically is less likely to be dropped/filtered than ICMP.
Aside: asking how traceroute works is one of my interview questions, most people don't know (if they do the question is no good) and many are unable to figure it out from first principles no matter how many questions I answer about TCP/IP. I still think being able to figure it out is a reasonable problem solving question.
1. e.g. https://www.bgplookingglass.com/
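For anyone trying to work it out from first principles, here is a minimal simulation of the TTL trick described above. It sends nothing over the network; the `Hop` class, the hop names, and the `decrements_ttl` flag are all made up for illustration, and real routers have more edge cases than this.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    name: str
    decrements_ttl: bool = True  # False models a middlebox that stays hidden


def probe(path, ttl):
    """Return the name of the device that answers a probe sent with this TTL."""
    for hop in path:
        if not hop.decrements_ttl:
            continue  # packet passes through with its TTL untouched
        ttl -= 1
        if ttl == 0:
            return hop.name  # this router would send ICMP Time Exceeded
    return "destination"  # TTL outlived every router, so the target replies


def traceroute(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers."""
    seen = []
    for ttl in range(1, max_ttl + 1):
        responder = probe(path, ttl)
        seen.append(responder)
        if responder == "destination":
            break
    return seen


# Hypothetical path: the names are invented for this example.
path = [
    Hop("home-router"),
    Hop("stealth-firewall", decrements_ttl=False),
    Hop("isp-edge"),
    Hop("transit-core"),
]
print(traceroute(path))
```

The hidden middlebox never shows up in the result because it never decrements the TTL, which is the "wish to remain hidden" case mentioned above.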
Curious what type of roles you interview for, are they networking-centric? Iirc this is CCNA-level material, I'd expect anyone working in networking to be able to describe how traceroute works. I've used it more as a smoke test question than a question that most people don't know.
thanks! ive been wondering about this for ages!
It just says that outputTTL is (inputTTL - 1). With some exceptions.
[edit: I missed that that RFC is for MPLS but would be interested in your comment anyway; the definitive version seems to be https://datatracker.ietf.org/doc/html/rfc1122]
And WOAH, that's really interesting about TTLs. Thanks for sharing! That's awesome and terrifying!
TBF it's neither TCP nor IP, but ICMP :-)
Edit: I meant to write "neither TCP nor UDP" -- but even though I could make the correction I'll leave my error in place.
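For reference, this is roughly what an ICMP echo request (the probe type that ping and Windows tracert send) looks like on the wire. It is a sketch that only builds the bytes; actually sending it needs a raw socket and usually elevated privileges, which is left out here. The function names are my own.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement sum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def icmp_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request: type 8, code 0, checksum, id, sequence."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = icmp_echo_request(identifier=0x1234, sequence=1, payload=b"hi")
print(packet.hex())
```

A receiver can validate the packet by running the same checksum over the whole message; a correct packet sums to zero.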
Huh. TIL. But how would a router know how long a packet took to traverse the hop? The packet doesn't have the information to figure that out … were they expecting people to configure routers to know how far away the previous hop was?
Also >1 second is most of the way to the moon. (Yes, yes, speed in a non-vacuum is blah blah and switches blah buffers blah…)
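The back-of-envelope arithmetic behind the moon claim, using the vacuum speed of light and the mean Earth-Moon distance (signals in fiber are slower, as noted above):

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # in vacuum
MEAN_MOON_DISTANCE_KM = 384_400    # average Earth-Moon distance

# Distance covered in one second, as a fraction of the trip to the moon.
fraction = SPEED_OF_LIGHT_KM_S * 1.0 / MEAN_MOON_DISTANCE_KM
print(f"{fraction:.0%} of the way to the moon in one second")
```

That comes out to a bit under four fifths of the distance.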
- I wonder if you could get more accurate results by using TCP or UDP instead of ICMP. I think traditional traceroute has an option to use UDP, mtr [1] can use TCP or UDP, and tcptraceroute [2] can use TCP.
- This would be a perfect fit for some Talking Heads references. "And you may ask yourself, well, how did I get here?" [3]
[1] https://github.com/traviscross/mtr
[2] https://linux.die.net/man/1/tcptraceroute
[3] https://en.wikipedia.org/wiki/Once_in_a_Lifetime_(Talking_He...
2. Wayy ahead of you, check for HTML comments :))
(Now if only I could figure out how to enable traceroute to work on each hop from a given workstation through corporate cisco access switch, core switch, BGP tunnel to aws transit gateways, and eventually land at the VPC route table on the EC2 instance, then i might actually be able to call myself a network guy)
Unfortunately so many nodes ignore traceroute packets that it basically said my exit node connected to Linode and then Linode connected to your computer. I have the same experience with forward traceroutes, my router replies, my server replies, and if I'm lucky, one node in my ISP's network. The rest is locked up tight.
traceroute bad.horse
https://itsfoss.com/star-wars-linux/
But when I tested just now it didn't work for me so your mileage may vary.
tracepath -m128 bad.horse
works just as well.
Keep up the amazing work!
A technical nitpick though: Routes can be asymmetric—going across one path in one direction and another for the opposite. This means that your tool potentially doesn't show the route packets from the user took to reach your server, but rather the route packets took from your server to reach the user. I believe that querying with BGP looking glass tools would allow you to construct the route in either direction, but it is maybe a bit less cool looking than the real-time traceroute that is a result of actual traffic.
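To make the asymmetry concrete, here is a toy model where every router has its own next-hop table per destination. The topology and names are invented; real routers pick next hops from BGP and IGP state, not a hard-coded dict.

```python
def walk(next_hop, src, dst, limit=16):
    """Follow each router's own forwarding decision from src to dst."""
    path, node = [src], src
    while node != dst and len(path) <= limit:
        node = next_hop[node][dst]
        path.append(node)
    return path


# Hypothetical topology: each entry maps destination -> next hop for one router.
next_hop = {
    "user":      {"server": "isp-a"},
    "isp-a":     {"server": "transit-x", "user": "user"},
    "transit-x": {"server": "server", "user": "isp-a"},
    "transit-y": {"server": "server", "user": "isp-a"},
    "server":    {"user": "transit-y"},
}

forward = walk(next_hop, "user", "server")
reverse = walk(next_hop, "server", "user")
print("user -> server:", forward)
print("server -> user:", reverse)
```

Here the user's packets cross transit-x while the server's replies cross transit-y, so a traceroute run from the server side never sees transit-x at all.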
I'm posting here because it might be interesting for you. How it was built: https://presentations.clickhouse.com/meetup85/app/index.html
I was reminded of working at a company in 1996... we had windows 95(!) with Trumpet WinSock and a dial-up modem (24k, IIRC). I was just learning how all this stuff worked and fit together. I stumbled on a traceroute screen that would slowly drip out each hop and... it was magical to me. Suddenly realizing the idea of 'a big global network' I'd read about was actually... right at my fingertips, and I could see which computers my traffic was being routed through... that kept me up at night for a while. Not sure I'll say it was life-changing, but it sort of felt like it for a bit at that time :)
The narrative based traceroute in green is something I’ve never seen before. How many providers, like CDNs, did you take the time to map into a narrative?
This feels targeted towards folks who kind of already understand computers. It would be cool to repackage this in a way that can show non-technical people the stuff they take for granted. The mountains that move on a user’s behalf under every keystroke are humbling.
One of my favorite books on this topic is Interconnections: Bridges, Routers, Switches, and Internetworking Protocols by Radia Perlman if you haven’t come across it.
Easier said than done, but don't feel like you have to provide a constant flux of interesting things. That kind of pressure ends up being toxic pretty quickly. Do what you enjoy, if it hits an audience, great, but don't feel like you have to make it happen.
Total side note, your other post (about cpu.land) is at exactly 1337 upvotes now :D https://share.cleanshot.com/ktVWL2pr
How fitting.
https://news.ycombinator.com/item?id=37062422
I'm 17 and wrote this guide on how CPUs run programs (3 months ago)
Kudos and carry on!
It's actually impossible. Responses are essentially free-form (if the server responds at all). I tried my hand at this; you can make an ad-hoc "parser" that works for 90% of addresses/domains (or you could, ten years ago when I tried). But the remainder are intractable.
Nowadays it's much worse; nearly everything is hidden behind privacy shields, which purport to protect PII. But WHOIS records aren't supposed to contain personal information; they're supposed to contain contact information for network operators.
This is ICANN's doing, I'm afraid. ICANN had a rule that networks should provide public WHOIS servers. They never enforced the rule, and now they've scrapped it.
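To show what an ad-hoc "parser" of this kind tends to look like, here is a best-effort sketch. The sample record is made up, and the "Key: Value" and comment-prefix conventions are only common habits, not something any spec guarantees, which is exactly why the remainder stays intractable.

```python
def parse_whois(text):
    """Best-effort 'Key: Value' extraction; anything else is kept as unparsed."""
    fields, unparsed = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("%", "#")):
            continue  # skip blank lines and comment lines
        key, sep, value = line.partition(":")
        if sep and value.strip():
            # Keys can repeat (e.g. several name servers), so collect lists.
            fields.setdefault(key.strip().lower(), []).append(value.strip())
        else:
            unparsed.append(line)  # free-form text we cannot interpret
    return fields, unparsed


# Made-up record for illustration only.
sample = """\
% made-up record for illustration
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Name Server: NS1.EXAMPLE.COM
Name Server: NS2.EXAMPLE.COM
Registrant contact is available on request
"""

fields, unparsed = parse_whois(sample)
print(fields)
print(unparsed)
```

Anything that lands in `unparsed` is where the hand-written special cases start piling up.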
Doesn't a whois have to include email, phone number and physical address? For a company that's not really PII, but I don't understand how it wouldn't be considered personal information in the whois for my personal website.
Not everything runs an RDAP server, though; I do wish ICANN/IANA or whoever would enforce that.
> Nowadays it's much worse; nearly everything is hidden behind privacy shields, which purport to protect PII. But WHOIS records aren't supposed to contain personal information; they're supposed to contain contact information for network operators.
Network operator info can also be PII. My info is PII, but I have a domain name, so putting my info into WHOIS is putting PII into WHOIS.
The privacy guard just forwards everything to me, minus spam.
(If it's a corporation, I don't think there's a good reason to permit privacy guards. But not all domains are owned by BigCos, yet.)
RDAP has the benefit of being JSON, but even then it’s a reaaally crappy format. For example, contacts are represented by the jCard pseudo-standard, which is a JSON version of vCard, and it’s completely awful and hard to deal with. Basically instead of a nice JSON object, it’s arrays in arrays in arrays…
RDAP should get better in the future versions, but I’m not sure registrars will follow in good faith because the initial specs were a bit of a shit show.
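For a taste of the arrays-in-arrays shape, here is a small sketch that flattens a jCard (RFC 7095) value into a plain dict. The sample contact is invented, and the helper ignores property parameters and value types entirely, so treat it as illustration rather than a complete jCard reader.

```python
def jcard_to_dict(vcard_array):
    """Flatten a jCard array into {property_name: [values]}."""
    tag, properties = vcard_array
    if tag != "vcard":
        raise ValueError("not a jCard array")
    contact = {}
    for name, _params, _value_type, *values in properties:
        # A multi-valued property carries its values as extra array elements.
        value = values[0] if len(values) == 1 else values
        contact.setdefault(name, []).append(value)
    return contact


# Made-up contact in the shape RDAP uses for an entity's "vcardArray".
sample = ["vcard", [
    ["version", {}, "text", "4.0"],
    ["fn", {}, "text", "Example Network Operations"],
    ["email", {}, "text", "noc@example.net"],
    ["adr", {}, "text", ["", "", "123 Example St", "Springfield", "", "00000", "US"]],
]]

contact = jcard_to_dict(sample)
print(contact["fn"][0], contact["email"][0])
```

Note that even after flattening, structured values like `adr` are still positional arrays, so you have to know that index 2 is the street address.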
Could generative AI help out these days? "Here's a whois, give me [the info I want]:"
https://archive.nanog.org/sites/default/files/traceroute-201...
And yes, traffic is routing different ways from my Cairo office to my UK core: London -> Cairo is direct and still suffering massive loss, Cairo -> London is now routing via NTT and seems fine. If they haven't fixed it by tomorrow I might have to change some local prefs.
I'm working on it right now and hopefully will be working better soon! In the meantime I've increased timeouts so loading will be longer but it should work better.
Then one packet will get all the echoes in one go instead of having to send a tirade of packets with increasing TTL values.
If you care about time rather than packet count, you can send packets with all reasonable TTL values at once.
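One way to make the "all TTLs at once" approach workable is to give each probe a distinct UDP destination port, since ICMP Time Exceeded replies quote the start of the original packet, ports included. The sketch below only does the bookkeeping with fake replies and opens no sockets; the function names and reply tuples are my own invention.

```python
BASE_PORT = 33434  # traditional traceroute's default starting UDP port


def plan_probes(max_ttl=30):
    """One probe per TTL, each aimed at its own destination port."""
    return {BASE_PORT + ttl: ttl for ttl in range(1, max_ttl + 1)}


def order_replies(replies, port_to_ttl):
    """replies: (router_address, quoted_dest_port) pairs in arrival order."""
    hops = {}
    for address, port in replies:
        ttl = port_to_ttl.get(port)
        if ttl is not None:  # ignore replies that match no probe we sent
            hops[ttl] = address
    # Fill silent hops with "*", the way traceroute prints them.
    return [hops.get(ttl, "*") for ttl in range(1, max(hops, default=0) + 1)]


port_to_ttl = plan_probes()
# Fake replies arriving out of order, using documentation-only addresses.
replies = [("203.0.113.3", 33437), ("192.0.2.1", 33435), ("198.51.100.9", 33999)]
print(order_replies(replies, port_to_ttl))
```

Because every reply identifies its own probe, arrival order stops mattering and the whole trace takes about one round trip to the farthest hop instead of one per TTL.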
Just use a better client. Takes about 3 seconds to do an mtr -b over 15 responding hops from a server in London to something like 43.249.179.0 in the south pacific.
I guess it's supposed to do something like this: https://dnschecker.org/online-traceroute.php
tl;dr in my experience the networks traversed are usually very similar, and the content is relevant and interesting either way around
If you think about how IP works, you’ll see that this doesn’t particularly matter but that it can make understanding the routing more difficult.
Boise State University and the University of Idaho are two schools at opposite ends of the state of Idaho. UIdaho in the north is close to Spokane, and almost all of its connectivity comes from Seattle. Boise is closer to Salt Lake so most of its connectivity comes via Portland or Salt Lake City. The middle of the state between the two schools is mountains and very, very little large scale connectivity at all, except there was a small line way back in the day because the UofIdaho had remote classrooms in the southern part of the state. Sometime in the late 90's a network engineer from BSU and one from UofI realized that they both had switches and routing kit in the same building so they ran an ethernet cable between them.
The effect was catastrophic. It turns out that both networks happily started announcing BGP to each other, which in turn announced the connection to the internet as a whole. Suddenly there was a very short jump between networks in Seattle and networks in Salt Lake City. That poor little T1 (iirc) was absolutely getting saturated. But, interestingly, only in one direction. See, Boise announced the route, but Idaho didn't, so the traffic was effectively only failing in one direction.
Needless to say the cable was disconnected, and years later when I worked at the UofIdaho it was still well known that the two networks shouldn't ever be connected again! (Which was ironic because I was working on a program to set up I2 at both universities)
How did you manage to tilt the section header text? I've not seen that done before.
transform: rotate(-2deg) skew(-2deg);
transform-origin: bottom left;
Holy shit. This girl's going places. I just skimmed https://kognise.dev and saw that in addition to the deep understanding of TCP/IP and all 7 layers of the OSI model she appears to possess, she also does front- and back-end development, embedded hardware, mobile apps, and compilers. She also rock climbs, can pilot a Cessna (all by herself), builds robots, and plays (and composes music for) the cello (since she was 5 years old apparently).
Do I need to keep going? This is nothing short of incredible. If I did 1/10 of the things this kid's already done by the time I kick the bucket I would have lived a full life.
.text h3 {
margin-top: 60px;
margin-bottom: -4px;
transform: rotate(-2deg) skew(-2deg);
transform-origin: bottom left;
}
It's actually surprisingly easy to get an ASN for yourself and speak BGP. If you find building something like this tool interesting, you should give it a try. I wrote an introduction of sorts earlier (https://qt.ax/asn) if that interests you.
https://research.cs.washington.edu/networking/astronomy/reve...
paper: http://www.cs.washington.edu/homes/ethan/papers/reverse_trac...
This article from APNIC explains more about mtr and how to read it (plus some interesting details about how MPLS can obscure true paths)
https://blog.apnic.net/2022/03/28/how-to-properly-interpret-....
Also worth noting: It's also sometimes useful to trace with UDP, and many routers will drop ICMP selectively under strain.
Nice article, and excellent presentation!
I was wondering if we'd address this. That was my first thought - how can you do this without initiating ICMP from my side?
> Does running a “reverse traceroute” sacrifice accuracy? A little, actually.
> As I said when describing Internet routing, each device a packet traverses makes a decision about where to send the packet next until it reaches its final destination. If you send a packet in the other direction, the devices might make different routing decisions… and if one device makes one different decision, the rest of the path will certainly be different.
> This reverse traceroute is still helpful. The paths will be roughly the same, likely differing only in terms of which specific routers see your packet.
Sure... But it's pretty common for multi-pathed ASes to traverse in all sorts of different ways. My experience (non-residential) is that more often than not, the trace and reverse trace were different. Your upstreams and my upstreams have very different commercial agreements, and both peer and transit in multiple places.
Still cool though, well done!
 9  a23-203-147-39.deploy.static.akamaitechnologies.com (23.203.147.39)  36.707 ms  36.783 ms  40.110 ms
10  * * *
11  * * *
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
I announce only IPv6, because I don't currently have access to any v4 blocks, they are expensive and I have little need for one.
Makes me imagine an online programming textbook that could walk you through what your own custom code is doing. Very cool!
A Practical Guide to (Correctly) Troubleshooting with Traceroute by Richard A Steenbergen explains it well
Can I trace the location of an AS?