"A switch (or an L2 switch :-) ) is an L2-only thing."
I don't know what L2 means. I suspect a definition of the various levels would expand the audience for this post.
https://en.wikipedia.org/wiki/Internet_protocol_suite#Compar...
> The IETF protocol development effort is not concerned with strict layering. Some of its protocols may not fit cleanly into the OSI model, although RFCs sometimes refer to it and often use the old OSI layer numbers. The IETF has repeatedly stated that Internet protocol and architecture development is not intended to be OSI-compliant. RFC 3439, referring to the Internet architecture, contains a section entitled: "Layering Considered Harmful".
Anyway: People sometimes like to pretend that OSI is a model and TCP/IP implements the model, forgetting that OSI is/was a protocol stack and TCP/IP has no interest in being "compliant" with any other protocol stack to the extent it mimics its layering architecture.
This is the inside view of how exactly a router operates. You only need to know this if you are poking inside a router implementation. If that is the case, my condolences.
If you're poking inside a router implementation, it seems fair to expect that you have a basic understanding of OSI networking layers.
Ethernet. L2 means Ethernet (or WiFi). Ethernet is the envelope we put Internet traffic in (L3) and the layers above that are about nailing down how exactly a conversation is managed. Sometimes people get upset about what constitutes Layers 5-7, especially since that Tim Berners-Lee joker ruined all the pretty pictures with HTTP. So mostly we only talk about 2,3,4 and 7, in the same way you don’t bring up religion or politics at a family reunion.
This is the first time I am reading this. I interpret it to mean that HTTP is badly designed and Tim Berners-Lee caused it. Need more...
I'm aware there are levels of information in an IP packet, but I don't know them offhand. If I have to google something in the first sentence of a high-level overview, then I'm likely not going to read the piece and the author has lost me as a reader. Maybe I'm not the target audience, though I was interested. I'm providing that as feedback for the original author, since the piece mentions that it's still a work in progress.
Then the author wouldn't have to, and we wouldn't need to use a search engine!
[0]: https://en.wikipedia.org/wiki/OSI_model#Layer_2:_Data_Link_L...
Discoverability lives in the space between overexplaining and underexplaining.
Correction: the network stack has layers, where IP is one of them, near the top.
Which is why most software targets IP. It’s a good abstraction and it’s portable.
1. physical layer, 2. data link, 3. network, 4. transport, 5. session, 6. presentation, 7. application layer.
So, many switches are layer 2, but layer 3 switches are often referred to as switching routers. This can cause two different switches to act differently from each other in certain network environments. It isn't that one switch "doesn't work" but that it isn't a router.
A router is nominally a L3 device, though most actually are L1-7. To work, you need L1 & L2, but in today's world, there are applications and interfaces that move the router across L1-7, though not to the same depth as purpose built application devices for example. Topping this off, some routers will switch and some will not. It's the same wide-world of words that we see across the whole computer industry.
The OSI model differs from the TCP model of networking, even though both use numbered layers.
Some devices can do L2 and L3 at the same time. That’s why another term came up: L2 only switch.
And so on, you can read it more on [1].
This clears some points that always puzzled me:
If the gateway is identified by an IP address, but the destination host is also an IP address, which address exactly is put into the packet? And how can a packet be routed if the gateway's IP is itself part of the subnet that's supposed to be routed to it. (E.g. 192.168.0.0/24 with default gateway 192.168.0.1)
So the answer is, if I send the packet to host 1.1.1.1 but the routing table has 2.2.2.2 as the next hop, the packet will have 1.1.1.1 as the destination in the IP part but the MAC of 2.2.2.2 as the destination of the Ethernet part (or equivalent). It doesn't matter which subnet the next hop's IP is in, as the routing table isn't consulted for it anyway - it's only used in ARP.
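To make that concrete, here is a rough Python sketch of what the sending host does. The routing table, the ARP cache contents and the `build_frame` helper are all made up for illustration; this is not how any particular network stack is written, just the lookup order described above.

```python
import ipaddress

# Hypothetical routing table: (prefix, next hop). None means "directly connected".
ROUTES = [
    (ipaddress.ip_network("192.168.0.0/24"), None),
    (ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("192.168.0.1")),
]

# Hypothetical ARP cache (IP -> MAC); the MAC values are invented.
ARP_CACHE = {
    ipaddress.ip_address("192.168.0.1"): "aa:aa:aa:aa:aa:01",
    ipaddress.ip_address("192.168.0.42"): "aa:aa:aa:aa:aa:42",
}


def lookup_next_hop(dst):
    """Longest-prefix match; returns the IP we need a MAC for."""
    matches = [(net, hop) for net, hop in ROUTES if dst in net]
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    # On-link destinations are their own "next hop".
    return hop if hop is not None else dst


def build_frame(dst_ip):
    """Return the two destination fields that end up on the wire."""
    dst = ipaddress.ip_address(dst_ip)
    hop = lookup_next_hop(dst)
    # The next hop's IP is only used here, as the key for the ARP lookup.
    return {"eth_dst": ARP_CACHE[hop], "ip_dst": str(dst)}


print(build_frame("1.1.1.1"))       # IP destination stays 1.1.1.1, MAC is the gateway's
print(build_frame("192.168.0.42"))  # on-link: MAC is the destination host's own
```

Note that the gateway's IP address never appears in the resulting frame; only its MAC does.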
This leaves the question, why the indirection and why the mucking around with ARP and IPs that are never used as the destination to anything?
Couldn't you simply put the next hop's MAC address (instead of IP address) into the routing table and be able to route packets just as well, with a lot less complexity?
Ethernet's addressing scheme was not designed to accommodate large hierarchical networks and so is unsuitable for the IP use case, but more importantly, IP was designed completely separately from Ethernet, and was not used primarily with Ethernet until later, so IP could not "assume" that the layer below it handled addressing (typically there was either no layer below [point-to-point] or only a very simple one).
The result is that Ethernet and IP duplicate functionality to some extent. It is theoretically possible, although not common, to build a network which uses only layer 3 routing without any reliance on Ethernet addressing. A significant reason this is rare, arguably the most significant reason, is that IP is now carried over Ethernet a significant majority of the time and L2 Ethernet devices (like switches) require the use of Ethernet addressing for the network to function. You usually see "pure IP" in virtual networking environments where the IP is encapsulated in, well, more IP, but even then Ethernet frames are sometimes used because, well, just like network hardware, operating system network stacks generally expect them (examine, e.g., the linux bridge implementation). It is completely possible to build network stacks and network appliances which do not require the use of Ethernet but it is expensive and there's not much of a motivation to do so, and you'd run into issues with any kind of equipment not so designed.
Addressing is not the only duplicate functionality between Ethernet and IP, and it's one of the less significant ones since Ethernet addressing does provide utility even if not strictly required. Ethernet frames are checksummed, and IP headers are also checksummed, even though the Ethernet checksum is already over them. The IP header checksum exists because IP was historically carried over lower layers that did not provide integrity checking. This is basically pure wasted space in typical networks, so IPv6 drops the header checksum to remove the overhead.
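For anyone curious what that IPv4 header checksum actually is, it's the 16-bit ones' complement sum described in RFC 1071. Below is a minimal Python sketch; the function name is mine, and the sample header bytes are just an example 20-byte header with the checksum field zeroed.

```python
def ipv4_header_checksum(header: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the header.

    Assumes an even number of bytes, which always holds for an IPv4
    header because its length is a multiple of 4.
    """
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Example header with the checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = ipv4_header_checksum(header)
print(hex(checksum))  # 0xb861

# A receiver runs the same sum over the header including the checksum;
# a result of 0 means the header verified.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(ipv4_header_checksum(filled))  # 0
```

It only covers the header, not the payload, which is part of why it adds so little on top of the Ethernet frame check.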
In general, though, network protocols tend to make more sense when you have some awareness of the history of their development. If you try to view the modern internet as an elegant, monolithic design, as some authors attempt, a lot of things won't make sense because they simply are that way for historic reasons. Ethernet and IP were each designed in the '70s, but separately, and their use has accumulated significant cruft since then, including some radical changes in the ways that they were used. One example is the transition of Ethernet from shared media to point-to-point, which occurred de facto earlier but became largely formalized with the introduction of GbE, which prohibits more than two hosts in a collision domain. Ironically, multiple hosts in a collision domain then returned as an even larger issue with wireless protocols, which require additional handling below, or actually in lieu of, the Ethernet layer: 802.11 is a replacement for Ethernet that happens to behave similarly in many ways for compatibility.
Finally, the OSI model is something that tends to add complexity and confusion to these discussions, which is why I doggedly discourage its use in teaching. The OSI Model describes the OSI protocols, which were contemporary competitors to the TCP/IP protocols. Arguably, one of the reasons that the OSI protocols fell out of use (in favor of IP) is exactly because they assumed seven layers, and each was fairly complex. Some OSI protocols are still in use, for example IS-IS (OSI layer 2) in the telecom industry and some backbone IP transit, but in niches and generally being replaced with IP. IP is intentionally simpler, and can be fully described using four layers, what's usually referred to as the TCP/IP model.
The OSI layers do not map 1:1 to the TCP/IP layers, even if you simply ignore the ones that map more poorly as instructors often do. Even worse, many instructors and textbook authors feel such a strong compulsion to map modern networks to the obsolete OSI model that they cram application-layer protocols into OSI layers 5 and 6 in order to have examples of them. I have seen cases as extreme as an instructor claiming that HTTP cookies represent the session layer. This kind of thing is nonsense and hinders understanding rather than contributing to it. If the OSI model is taught (not a bad idea at all as students should realize that TCP/IP is merely the popular way, and certainly not the only way), it should be taught specifically by contrasting it to the different TCP/IP model. Unfortunately few instructors and website authors today seem to even be aware that the OSI protocol stack existed separately from IP.
And, if you are wondering, yes, Ethernet can be used in a switched network completely independently from IP (although not really in a routed network unless you are generous about how you define routing). This was more common decades ago, the only equipment I have ever personally encountered that used bare Ethernet was a very outdated CNC setup.
Request. Do TLS next (if it’s in your wheelhouse). I’ve been looking for a good summary of ECC and selected curves in tls 1.2
You can only ARP for hosts on the same subnet as you, terrible hacks excluded.
> This leaves the question, why the indirection and why the mucking around with ARP and IPs that are never used as the destination to anything?
Because it was designed in layers so that different layers could be replaced. We didn't know we'd end up with mostly only IP and Ethernet in LANs back then.
> Couldn't you simply put the next hop's MAC address (instead of IP address) into the routing table and be able to route packets just as well, with a lot less complexity?
It could have been done in any number of ways. It's not that much complexity though, and it would bake Ethernet MACs into everything IP, even in the cases where it's not needed.
That's not /the/ reason why a MAC address is involved. It's because that's the address for a physical device at a lower layer in the stack. As others mention, IP is media-independent, it cannot depend on a lower tier addressing scheme without becoming fused to that medium
It is actually a pretty reasonable way of integrating hardware MACs directly into the internetworking stack.
Most of the very expensive 'multilayer' switches [2] do a form of this where they associate a next-hop IP with a MAC address entry and store that in the TCAM or data layer. It's not used as much because Cisco has a ton of patents on this type of technology, and also because general purpose hardware has gotten quick enough that it's not as important as it was ~15 years ago...
[1] https://en.wikipedia.org/wiki/Point-to-Point_Protocol
[2] https://en.wikipedia.org/wiki/Multilayer_switch#Layer-3_swit...
One reason why using an IP is still important is the IP can move to a different router, so the MAC for that IP can change. Eg if a hardware swapout was performed, or the network admin manually moved the IP, or some HA system that dynamically moves IPs to other routers (and isn’t VRRP, which uses a virtual MAC).
Usability: it’s a lot easier imo to read a routing table with an IP next hop than a MAC, as you don’t have to remember which MAC belongs to every machine. The IP also conveys visually which port the traffic is (probably) going out. E.g.:
Port 1 - 192.168.1.0/24
Port 2 - 192.168.2.0/24
If my next hop for 1.1.1.1 is via 192.168.2.254 I know immediately it’s going out port 2. If it was a MAC I’d have no clue unless I memorised all MACs in my networks.
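The same "read the port off the next hop" reasoning is easy to sketch in Python. The port names and the `egress_port` helper are hypothetical; the subnets are the ones from the example above.

```python
import ipaddress

# Connected subnets per port, as in the example above (port names are invented).
PORTS = {
    "port1": ipaddress.ip_network("192.168.1.0/24"),
    "port2": ipaddress.ip_network("192.168.2.0/24"),
}


def egress_port(next_hop):
    """Return the port whose connected subnet contains the next hop, if any."""
    hop = ipaddress.ip_address(next_hop)
    for port, subnet in PORTS.items():
        if hop in subnet:
            return port
    return None  # next hop is not on any connected subnet


print(egress_port("192.168.2.254"))  # port2
```

With a MAC as the next hop there is nothing comparable to match against; you would need a separate MAC-to-port table just to answer the same question.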
QEMU (and I think Docker too?) uses SLIRP internally for access between VMs, which is ultimately an IP-layer bridge.
On the WAN side (at least at one point, I could be out of date here) they didn't use Ethernet, but instead IP layer routing as well, on top of stuff like PPP and SONET.
This is exactly what Cisco Express Forwarding (and similar layer 3 switching technology) does. The adjacency table keeps all of the layer 2 information to be used for fast routing of packets. This was implemented on the CPU back in the day, but now usually done in the switching ASICs.
However, you still need layer 3 next-hop information in the routing table (and dynamic routing protocols). There are two reasons: 1. Ethernet is only one of many layer 2 technologies that IP supports, and 2. the MAC address for a particular IP address can change for various reasons, including hardware replacement and HA.
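A toy Python model of that split, to show why keeping the next-hop IP pays off. This is only a sketch of the idea (prefix -> next-hop IP in one table, next-hop IP -> layer 2 rewrite info in another); it is not CEF's actual data structures, and all addresses, MACs and interface names are invented.

```python
import ipaddress

# Forwarding table: prefix -> next-hop IP (hypothetical routes).
fib = {
    ipaddress.ip_network("1.1.1.0/24"): "2.2.2.2",
    ipaddress.ip_network("8.8.0.0/16"): "2.2.2.2",
    ipaddress.ip_network("10.0.0.0/8"): "3.3.3.3",
}

# Adjacency table: next-hop IP -> (destination MAC, outgoing interface).
adjacency = {
    "2.2.2.2": ("aa:aa:aa:00:00:02", "eth0"),
    "3.3.3.3": ("aa:aa:aa:00:00:03", "eth1"),
}


def forward(dst_ip):
    """Longest-prefix match in the FIB, then fetch the L2 rewrite info."""
    dst = ipaddress.ip_address(dst_ip)
    candidates = [net for net in fib if dst in net]
    if not candidates:
        return None  # no route
    best = max(candidates, key=lambda net: net.prefixlen)
    return adjacency[fib[best]]


print(forward("1.1.1.1"))  # ('aa:aa:aa:00:00:02', 'eth0')

# Hardware swap or HA failover: 2.2.2.2 now answers with a different MAC.
# Only one adjacency entry changes; every route via 2.2.2.2 follows along.
adjacency["2.2.2.2"] = ("bb:bb:bb:00:00:02", "eth0")
print(forward("1.1.1.1"))  # ('bb:bb:bb:00:00:02', 'eth0')
print(forward("8.8.8.8"))  # ('bb:bb:bb:00:00:02', 'eth0')
```

If the routes stored MACs directly, that failover would mean rewriting every affected route instead of a single adjacency entry.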
Several others have already answered your question -- the key points being "the OSI model" (e.g., layer 2 vs. layer 3) and the multitude of other layer 2 protocols which don't use MAC addresses -- so I'll mention one other important detail.
---
Although the Ethernet protocol itself has been around for ~40 years now, for the majority of that time it mostly only existed "in the LAN".
In fact, when it comes to "on the WAN", Ethernet is still a relative newcomer. Before ~15 years or so ago, pretty much no one was using Ethernet "on the WAN" -- instead, it was X.25 and frame relay and HDLC and PPP and ATM and POS on analog "leased lines" and ISDN and DS-{1,3}s and OC-{3,12,48,192}s.
Along came MPLS, MetroE, EoMPLS, Carrier Ethernet, etc., and soon enough everyone was "tunneling" Ethernet between sites but we were still mostly using those "legacy" protocols "on the WAN".
Over time, technology advanced to the point that "native" Ethernet eventually became feasible "on the WAN" -- in no small part because 1) Ethernet speeds kept increasing by an order of magnitude (!) every few years, 2) standardizing on Ethernet everywhere drove the costs down, and 3) Ethernet was "easy" (compared to all of those "WAN" protocols we were using up until this point) -- everybody already "knew" Ethernet because, by this time, everybody had been using it in their LANs for a decade or more!
Although ATM and SONET (at least) are still around in (some parts of) some service provider networks, they are now the exception and Ethernet -- to butcher a phrase -- "has eaten the world" but, as I mentioned, Ethernet "on the WAN" is still a relatively new thing.
---
So, I'll offer an alternative answer to your question:
> Couldn't you simply put the next hop's MAC address (instead of IP address) into the routing table and be able to route packets just as well, with a lot less complexity?
Sure, if you had done it about 30 years earlier!
No, because a MAC address only makes sense for Ethernet-like layer 2 protocols, and IP can run over any number of layer 2 protocols, including point-to-point protocols.
Though I always thought the "router switch" was much more fun.
What does this mean?
Then towards the end... "the packet is recycled". What?
Also, with the ping of death, the only way to use it was to very noticeably crash systems -- not to secretly build a botnet or something, as might have been done with RCE vulnerabilities.
It wasn't super notable. What was more horrific was the number of Windows machines that had TCP ports for various Windows services open to the internet, which led not only to crashes but to remote compromise and rootkit/botnet stuff. That went on for years and only got mitigated by people deploying routers with firewall/NAT functionality.
But I do remember AppleTalk causing issues more frequently on a network I helped manage that had radio studios with two Macs per studio, but mostly Windows PCs through the rest of the building.
That place also had a Macintosh 512K running its phone system until around 2010!
As a software engineer working on IOS-XR, that gave me a chuckle :p
In the case of enterprise- and SP-grade routers, the data-plane - i.e., where the actual forwarding and lookups take place - runs entirely on a dedicated network processor (NP), mainly for performance reasons. Information on the NP is populated by the router's operating system in response to user configuration, network topology changes, or protocol state updates. On the other hand, the control plane runs mainly on the CPU(s). This is required so that the protocols running on the router OS (e.g., BGP) can receive and send out updates based on their state machines.
Good good :D
Thanks for the clear data plane / control plane explanation, that's a good way to summarise the distinction. May I link to it from the article?
Protocol-wise, isn't it common now for the NP on higher end stuff to handle L4 and higher protocols? Or are those still largely managed by the CPU?
I am looking for a "layer 3 switch" that has switching and routing functionality without rebooting. If anyone knows of a software-based open source solution for this, it would be very helpful. A Cisco IOS-like command interface is preferred but not mandatory.
The article explains router internals based on P4. Perhaps I should try to use P4 for the above-mentioned requirements?
Also, consider upgrading to the more active fork called Free Range Routing.
It’s still a little odd, but as somebody quite comfortable with JunOS (I run Juniper switches in my homelab) it’s pretty easy to pick up any of the Vyatta forks and hit the ground running.
I promise to make it better and actually finish it now! Check back in a day or two I guess? Also I should post the code I promised. Hello from the ADHD squirrel!
Routing is only triggered when the packet is L2 terminated: the destination MAC of the packet is one of the router's own MACs.
If the packet's destination MAC does not belong to the router, it doesn't matter what is in its IP header, it will be switched in the LAN it came in on.
This design also generalizes nicely to the case when the destination IP of a routed packet is one of the router's IPs.
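That decision order fits in a few lines of Python. This is a simplified sketch: the router's MACs and IPs are invented, the `classify` helper is hypothetical, and broadcast/multicast frames are ignored entirely.

```python
# Hypothetical addresses owned by the router.
ROUTER_MACS = {"aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02"}
ROUTER_IPS = {"192.168.1.1", "192.168.2.1"}


def classify(eth_dst, ip_dst):
    """Decide what happens to a unicast frame arriving at the router."""
    if eth_dst.lower() not in ROUTER_MACS:
        # Not L2 terminated: the IP header is never looked at.
        return "switch"
    if ip_dst in ROUTER_IPS:
        # L2 terminated and addressed to the router itself.
        return "deliver-local"
    # L2 terminated, but the IP destination is someone else.
    return "route"


print(classify("de:ad:be:ef:00:01", "1.1.1.1"))      # switch
print(classify("aa:bb:cc:00:00:01", "1.1.1.1"))      # route
print(classify("aa:bb:cc:00:00:01", "192.168.1.1"))  # deliver-local
```

The MAC check comes first on purpose: a frame for another station gets switched no matter what its IP header says.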
I'd look at the Cisco Press and CCNA training materials.
This is not correct. The FIB (forwarding information base) is concerned with layer 2. The RIB (routing information base) determines the next hop. The RIB is what is used to populate entries in the FIB with the correct outgoing interface. These two terms are basic router terms. It was kind of surprising to see this statement in a post titled "How Do Routers Work, Really?"
When I first started working with routers, over 25 years ago, it was all Ethernet LAN to serial WAN, usually point-to-point T1 or frame relay. One site had a dual T1, load balanced on both ports of a Cisco 2501. Fun times.
It can be more than a router though.