I believe that better hardware architectures will have more impact on AI than better neural network architectures.
You're right that our architectures aren't adequate and that the biggest gains lie there. I/O is not the problem; in fact, our I/O is faster, precisely because it is dumb. We can stream massive amounts of data into a linear memory buffer, but brains use massively associative memory structures. Receiving a network packet is easy. Associating that packet with preexisting context (such as a TCP connection) is not so easy: it requires data structures, algorithms, memory locality, threading correctness, and procedural computing steps, because we have to build the context on top of a series of flat data.
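To make the contrast concrete, here is a toy Python sketch of that association step. The names (`ConnKey`, `demux`) are illustrative, not a real API: the point is that the hardware hands us a flat buffer, and the "associative" part is an explicit lookup structure we must build and maintain ourselves.

```python
# Toy sketch: associating an incoming packet with preexisting
# connection context. ConnKey/ConnState/demux are hypothetical names.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ConnKey:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

@dataclass
class ConnState:
    next_expected_seq: int = 0
    buffered: list = field(default_factory=list)

# The associative step is an explicit hash-table lookup keyed on the
# 4-tuple; the hardware itself only gives us flat linear buffers.
connections: dict[ConnKey, ConnState] = {}

def demux(key: ConnKey, seq: int, payload: bytes) -> ConnState:
    state = connections.setdefault(key, ConnState())
    state.buffered.append((seq, payload))
    state.next_expected_seq = seq + len(payload)
    return state

k = ConnKey("10.0.0.1", 40000, "10.0.0.2", 80)
demux(k, 0, b"GET /")
s = demux(k, 5, b" HTTP/1.1")
print(s.next_expected_seq)  # 14
```

In a brain-like associative memory, the packet would recall its context directly; here the recall is a procedural lookup we program by hand.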
If you're working hard on a subject, a random bit of incoming information about something else that concerns you might trigger an "I can't think about that right now" reaction in your brain, but the information has still been digested. The packet has reached the appropriate layer-7 ingress buffer just like that, but you deliberately don't context-switch to the respective application.
There is also an elephant in the room, and that is native language, which shapes the way we think and process information. Imagine a CPU receiving an automatic microcode update the moment you, as a programmer, define an abstract TCP stack in C or assembler, so that it can optimize itself to the point of switching into a "thinking in TCP" mode.
And you'd be able to train a learned optimizer to replace gradient descent as the training process.
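A minimal sketch of the learned-optimizer idea: parameterize the update rule itself and meta-optimize that parameter so that a few steps of the rule minimize the final loss. Everything here is a deliberate toy: a single learned step size on a quadratic, with grid search standing in for real meta-gradient training of a neural update rule.

```python
# Toy "learned optimizer": the update rule has a parameter `a` that we
# meta-optimize so K inner steps minimize the final loss. A real learned
# optimizer would use a neural network and meta-gradients; the grid
# search below is a simplification for illustration.

def inner_loss(w: float) -> float:
    return w * w            # toy inner task: minimize w^2

def inner_grad(w: float) -> float:
    return 2.0 * w

def run_optimizer(a: float, w0: float = 1.0, steps: int = 5) -> float:
    w = w0
    for _ in range(steps):
        w = w - a * inner_grad(w)   # learned update rule: u = -a * g
    return inner_loss(w)            # meta-loss: loss after K steps

# "Meta-training": pick the rule parameter with the best final loss.
candidates = [0.1, 0.2, 0.3, 0.4, 0.5]
best_a = min(candidates, key=run_optimizer)
print(best_a, run_optimizer(best_a))
```

On this quadratic, w after K steps is w0 * (1 - 2a)^K, so a = 0.5 drives the loss to zero in one step; the meta-search recovers exactly that.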
Even without either of those, performance improves predictably with more compute, thanks to scaling laws.
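The "predictable" part is that scaling laws model loss as a power law in compute, L(C) = a * C^(-b), which is a straight line on a log-log plot. A small sketch with synthetic constants (not from any real model family) shows how past measurements let you extrapolate to a larger budget:

```python
import math

# Toy scaling-law sketch: loss falls as a power law in compute,
# L(C) = a * C**(-b). The constants below are synthetic, chosen only
# to illustrate the fit-and-extrapolate procedure.
a_true, b_true = 10.0, 0.25
compute = [1e18, 1e19, 1e20]
loss = [a_true * c ** (-b_true) for c in compute]

# On a log-log scale the law is linear, so fit b as the slope between
# the first and last measured points, then recover a.
b_fit = -(math.log(loss[-1]) - math.log(loss[0])) / (
    math.log(compute[-1]) - math.log(compute[0]))
a_fit = loss[0] * compute[0] ** b_fit

# Extrapolate to 10x more compute than we measured.
predicted = a_fit * (1e21) ** (-b_fit)
print(round(b_fit, 3), predicted)
```

With clean synthetic data the fit recovers the exponent exactly; with real training runs you would fit in log-log space over many noisy points instead of two.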