The story as I learnt it goes roughly like this - hopefully someone on this forum with first-hand knowledge can chime in:
1. Ethernet happens. It is designed around a bus topology with a shared medium: everyone hears every frame and picks out the ones addressed to them (with half of the address space set aside for multicast).
2. Digital works on moving Ethernet from a bus to a star topology. The design explicitly disallows connecting stars to each other without an L3 router.
3. Unfortunately, a non-trivial product range ends up based on LAT - essentially serial port over Ethernet - and, supposedly because of miscommunication, LAT is a very... raw-Ethernet solution. There is no way to route it sensibly.
4. Suddenly there's a need for larger L2 segments, except Ethernet has no way to support them (it finally started gaining one around 2005, by throwing out everything you know about L2 switching).
5. It's too late to add features to Ethernet that would make it work over a larger span than a single star. The possibility of loops bringing the network down exists, and so do multicast storms (those weren't fixed).
6. The budget doesn't allow for much computing power, so a Z80 gets thrown in. Spanning Tree Protocol gets created in the vague hope of mitigating the curse of large L2 Ethernet zones. We get stuck with primitive MAC learning.
7. The genie is out of the bottle, and since you can crap out a too-large Ethernet network much more cheaply than you can build a properly routed one, the curse continues. Since cheap is king, you often do not even get STP. Large-scale networks fail when interns misconnect cables, and multigigabit backbones end up running at 10 Mbit/s because STP made an ancient switch in the cleaning closet the root of the tree. Cats and dogs living together, etc.
8. From around 2005, proposals to fix it properly show up. The solution? Put routing into Ethernet, using IS-IS as the routing protocol. On the other side, increasingly crazy centralized "decentralized" SDNs also try to set up L2 forwarding to deal with applications that can't cope with real IP subnetting. Somehow, passing Ethernet over XMPP over TLS (with BGP involved somewhere) is still better than Ethernet's MAC learning.
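
For anyone wondering how the switch in the cleaning closet from point 7 ends up as root, here is a minimal Python sketch of the election rule as I understand it: the bridge with the numerically lowest bridge ID (priority first, then MAC address) wins. The switch names and MAC addresses are made up for illustration, and real STP reaches this result by exchanging BPDUs between neighbours rather than by calling `min()` on a list, so treat it as a model of the outcome only.

```python
from dataclasses import dataclass

DEFAULT_PRIORITY = 32768  # default bridge priority in 802.1D


@dataclass(frozen=True)
class Bridge:
    name: str                         # hypothetical label, not part of STP
    mac: str                          # made-up address, colon-separated hex
    priority: int = DEFAULT_PRIORITY

    def bridge_id(self):
        # The bridge ID compares priority first, then the MAC as a number.
        return (self.priority, int(self.mac.replace(":", ""), 16))


def elect_root(bridges):
    # Lowest bridge ID wins; link speed and hardware age play no part.
    return min(bridges, key=Bridge.bridge_id)


switches = [
    Bridge("core-1", "00:1c:73:aa:00:01"),
    Bridge("core-2", "00:1c:73:aa:00:02"),
    Bridge("closet-10mbit", "00:00:0c:12:34:56"),
]

# Nobody configured priorities, so the lowest MAC decides.
print(elect_root(switches).name)  # closet-10mbit

# Lowering the priority on a core switch is the usual way to pin the root.
pinned = [Bridge("core-1", "00:1c:73:aa:00:01", priority=4096)] + switches[1:]
print(elect_root(pinned).name)  # core-1
```

With every priority left at the default, the tie-break is just the MAC address, which says nothing about how capable the switch is. That is why an unconfigured network can pick the worst possible root.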