While there is a natural encoding of this process (a pointer to the value plus a pointer to the next node), it is by no means the only possible encoding.
A good exercise, if you are having trouble with this, is to implement a little linked list that performs those operations but is simply backed by an array, including the blind O(n) copy when appending another list. It should be quite short. And that is not the only alternate implementation, either: you can easily build others where the interface is maintained but you pay at most an extra log n factor per operation, a cost easily recovered from the constant factors, which on modern hardware are often staggering.
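A minimal sketch of that exercise in Python, assuming nothing beyond the comment above: the familiar cons/head/tail interface, but backed by a flat array, with the blind O(n) copy on append. The names (`ArrayList`, `cons`, `head`, `tail`, `append`) are illustrative, not from any particular library.

```python
class ArrayList:
    """Immutable list with a linked-list style API, backed by an array."""

    def __init__(self, items=()):
        self._items = tuple(items)  # the whole "list" is one flat array

    @staticmethod
    def cons(value, rest):
        # O(n) here, where a linked list would be O(1) -- the point is
        # that the interface survives, not that every cost is identical.
        return ArrayList((value,) + rest._items)

    def head(self):
        return self._items[0]

    def tail(self):
        return ArrayList(self._items[1:])

    def append(self, other):
        # the "blind O(n) copy" when appending another list
        return ArrayList(self._items + other._items)

    def is_empty(self):
        return len(self._items) == 0


xs = ArrayList.cons(1, ArrayList.cons(2, ArrayList()))
ys = xs.append(ArrayList((3, 4)))
print(ys.head())         # 1
print(ys.tail()._items)  # (2, 3, 4)
```

Every operation keeps the linked-list contract (immutability included: `xs` is untouched by the append), while the memory representation is just a contiguous array.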
Once you break, in your mind, the entanglement between a data structure's memory representation and the API it offers, many ways to improve the structure while maintaining the same operations become obvious, and many of them of course already exist and have names.
But just because Lisp has cons and cdr does not mean the compiler is obligated to use only and exactly things that structurally look like linked lists in the actual implementation. You need to break the relationship between memory layout and API to understand what is going on here. You may choose later to put them back together; the naive memory layout is often a good one. (Linked lists nowadays are arguably one of the larger exceptions to that, but even so there are circumstances where it may not be an issue; see Erlang's memory management, for instance, and ponder how it affects linked-list locality.) But you can't understand what compilers may be doing if you do not realize that there is a distinction.
If your argument is that a linked-list-like data structure can be implemented using something other than a linked list, then I agree with you. Vector tries (for example) are great for that use case.
But a vector trie isn't a linked list, it's a vector trie. As such, it will be faster for some usage patterns, equal for some, and completely degenerate for some others. Just like any other data structure that isn't a linked list.
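To make the vector-trie point concrete, here is a toy persistent vector trie in Python with a branching factor of 4 (real implementations like Clojure's use 32). It supports O(log n) `get` and `append` with path copying, so old versions stay valid, but, as the comment says, it is not a linked list: consing onto the front, for instance, would be degenerate. This is an illustrative sketch, not production code.

```python
BITS = 2             # branching factor 4, to keep the example small
WIDTH = 1 << BITS
MASK = WIDTH - 1

class VectorTrie:
    """Minimal persistent vector trie: O(log n) get and append."""

    def __init__(self, count=0, depth=1, root=None):
        self.count = count
        self.depth = depth            # number of trie levels
        self.root = root if root is not None else []

    def get(self, i):
        node = self.root
        for level in range(self.depth - 1, -1, -1):
            node = node[(i >> (level * BITS)) & MASK]
        return node

    def append(self, value):
        root, depth = self.root, self.depth
        if self.count == WIDTH ** depth:   # root full: grow one level
            root, depth = [root], depth + 1
        root = self._insert(root, depth - 1, self.count, value)
        return VectorTrie(self.count + 1, depth, root)

    def _insert(self, node, level, i, value):
        node = list(node)                  # path copy: old version intact
        if level == 0:
            node.append(value)
            return node
        idx = (i >> (level * BITS)) & MASK
        if idx == len(node):               # need a fresh subtree
            node.append(self._insert([], level - 1, i, value))
        else:
            node[idx] = self._insert(node[idx], level - 1, i, value)
        return node
```

Only the nodes along one root-to-leaf path are copied per append, which is where the log n factor (and the structural sharing between versions) comes from.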
It wouldn't really be an "optimization" to implement my linked list code with something other than a linked list - it would be rewriting my code.
That worked surprisingly well, but there were lots of edge cases since I am a professional bassoonist and not a compiler guy.
I generalized it and allocated all lists in chunks of 16, then let the GC truncate them. I spent a stupid amount of work on the first thing, while the second was done in an afternoon.
Then there are all kinds of
To have a good chance that there is space near the other nodes, you’d have to reserve memory up front for each list you’d want to do that with, though, and that will kill cache locality before the optimization kicks in. Corrections welcome, but I don’t see that (easily) being a net win. But hey, with a sufficiently smart compiler at hand, you can do wonders. (Edit: for pure functions, it often may not be that hard to discriminate between “this allocation will be returned” and “this allocation is temporary”. For example, if you use map to apply a pure function to all elements of a list, detecting that only that function’s return values remain visible after it returns isn’t that hard.)
An easier win is that Lisp garbage collectors tend to move memory, and thus can try to move nodes that reference each other close together. You get that for free with a semispace collector, but that has quite a few disadvantages, so you may not want to use one. A generational collector will also do it a bit, though: say you create a few million nodes of garbage while building a list of a few hundred results; after a generational collection, those few hundred cells will sit closer together than before.
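The locality effect of a copying collector can be simulated in a few lines. This toy sketch (all names illustrative) models memory as a dict from integer "addresses" to cons cells, scatters a live list among garbage, then copies the reachable chain into a fresh to-space with a bump allocator, the way a semispace collector would; the live cells come out at consecutive addresses. A real collector also needs forwarding pointers to handle shared structure, which a single chain lets us skip.

```python
def copy_list(from_space, head):
    """Copy the cons chain starting at `head` into a fresh to-space.

    from_space maps address -> (value, next_address or None).
    Returns (to_space, new_head); live cells end up contiguous.
    """
    to_space = {}
    prev_new, new_head = None, None
    addr = head
    while addr is not None:
        value, nxt = from_space[addr]
        new_addr = len(to_space)          # bump allocation: consecutive
        to_space[new_addr] = [value, None]
        if prev_new is None:
            new_head = new_addr
        else:
            to_space[prev_new][1] = new_addr  # relink in the new space
        prev_new = new_addr
        addr = nxt
    return to_space, new_head

# Build a 3-cell list whose live cells are scattered among garbage:
from_space, head = {}, None
for v in [3, 2, 1]:
    for _ in range(1000):                 # interleave garbage cells
        from_space[len(from_space)] = ("garbage", None)
    addr = len(from_space)
    from_space[addr] = (v, head)          # cons v onto the list
    head = addr

to_space, new_head = copy_list(from_space, head)
# The live cells now sit at addresses 0, 1, 2: adjacent, garbage left behind.
```

Thousands of addresses apart before the copy; neighbors after, which is exactly the locality the comment describes.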