One of these days I really need to post the "ideas for languages" that I've got banging around on my hard drive, but one of them is "a language that deals with the increasingly heterogeneous nature of the computer". You've got the CPU, the GPU, efficiency cores, who knows what else in the future (NN cores), and it's only a small hop from there to consider other computers as resources too.
Full disclosure: I have no idea whatsoever what this looks like, especially given that you need to build not just for the exact machine you're developing on but for machines in the future as well. Some sort of model of what is being computed and some guesstimate at the costs? (Something like an SQL query planner, where you declare your goal and it works out which resources to compute it with?) It's also possible that the gulfs in performance between all these parts are just too large to bridge, and manual scheduling of all these resources is the only choice.
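To make the "query planner" idea a bit more concrete, here's a toy sketch. Everything in it is made up for illustration: the `Device` fields, the cost formula, and especially the numbers. It's not how any real scheduler works, just the shape of "declare the work, let a cost model pick the resource".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Device:
    name: str
    throughput: float       # elements processed per second (invented figure)
    transfer_rate: float    # elements per second copied to the device; inf if data is already local
    launch_overhead: float  # fixed cost in seconds for each dispatch

def estimated_cost(dev: Device, n: int) -> float:
    """Crude guess: fixed overhead + time to move the data + time to process it."""
    return dev.launch_overhead + n / dev.transfer_rate + n / dev.throughput

def plan(devices, n: int) -> Device:
    """Pick the device with the lowest estimated cost for n elements."""
    return min(devices, key=lambda d: estimated_cost(d, n))

# Hypothetical machine description; a real one would have to be measured or shipped with the target.
DEVICES = [
    Device("cpu", throughput=1e9, transfer_rate=float("inf"), launch_overhead=1e-6),
    Device("gpu", throughput=5e10, transfer_rate=1e10, launch_overhead=1e-4),
]

small = plan(DEVICES, 1_000)          # overhead dominates, so the CPU wins
large = plan(DEVICES, 1_000_000_000)  # throughput dominates, so the GPU wins
```

The point is only that the same declared workload lands on different hardware depending on size, without the calling code saying where to run. Whether a cost model this crude survives contact with real performance gulfs is exactly the open question.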
Even just within a CPU it's rather annoyingly difficult to use vector-based code in modern languages. Perhaps something like an array-based language, but one that discards that field's bizarre love affair with single-character (if not outright Unicode) operators and can be read by a normal human, and that just affords writing code in a style where SIMD becomes a sensible default rather than something the optimizer laboriously reverse-engineers from your conventional imperative code. (Array-based programming could really use a "for humans" version of those languages in general.)
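As a rough illustration of the surface I mean (not the performance), here's a tiny sketch where every operation is stated over the whole array, with ordinary names. `Column`, `total`, and `dot` are hypothetical names of my own, and plain Python obviously won't emit SIMD for this; the idea is that a compiler handed code in this shape wouldn't have to rediscover the vector operation from a loop.

```python
import operator

class Column:
    """Hypothetical whole-array value: every operation applies to all elements at once."""

    def __init__(self, values):
        self.values = list(values)

    def _elementwise(self, other, op):
        # Column op Column pairs elements up; Column op scalar broadcasts the scalar.
        if isinstance(other, Column):
            if len(other.values) != len(self.values):
                raise ValueError("length mismatch")
            return Column(op(a, b) for a, b in zip(self.values, other.values))
        return Column(op(a, other) for a in self.values)

    def __add__(self, other):
        return self._elementwise(other, operator.add)

    def __mul__(self, other):
        return self._elementwise(other, operator.mul)

    def total(self):
        """Reduce the whole column to a single sum."""
        return sum(self.values)

def dot(a: Column, b: Column):
    """Dot product written as two whole-array steps: multiply, then reduce."""
    return (a * b).total()

prices = Column([2.0, 3.0, 4.0])
quantities = Column([1, 2, 3])
revenue = dot(prices, quantities)
adjusted = prices * 2 + 1
```

There's no index variable and no loop in the user-level code, and also no single-character operator soup.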
To some extent, just sitting down for a year to learn modern assembler and starting from the very, very bottom once again to build a high-level language would be an interesting exercise if nothing else, rather than starting with C and building "C, but ...", which is pretty much what every modern language being developed does.
Another little example: I think Jai was supporting structures-of-arrays instead of arrays-of-structures, though I don't know if they kept it. I'd like to see a language where the language-level data structures are explicitly viewed through the lens of "how I serialize these into memory", rather than the data structure implicitly creating such a specification by how it is defined. So, for instance, you could swap an SoA for an AoS by changing only the way the compiler serializes to RAM and not any of the rest of the code. Obviously you provide defaults that look like modern languages, but with this you could directly implement things like tagged unions with custom bit layouts, or, theoretically, directly access gzip'd data by specifying that this data structure can only be accessed sequentially; as long as that's what you do, you never need to explicitly unzip it, etc. This doesn't directly answer "how do you utilize modern hardware correctly", but it gives you tools to potentially create a better match than what compilers give by default.
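Here's a minimal sketch of the "layout is a separate, swappable thing" idea, emulated by hand with the standard `struct` module. The names (`FIELDS`, `AoS`, `SoA`, `Table`) are mine, and in the language I'm imagining the compiler would do this mapping rather than a runtime offset lookup. The logical record is declared once; the layout object alone decides where each field of each record lives in the buffer.

```python
import struct

# Logical record: field name and struct format code (4-byte float, 4-byte int).
FIELDS = [("x", "f"), ("y", "f"), ("id", "i")]

class AoS:
    """Records stored one after another: x0 y0 id0 x1 y1 id1 ..."""
    def __init__(self, fields, count):
        self.offsets, pos = {}, 0
        for name, code in fields:
            self.offsets[name] = pos
            pos += struct.calcsize("<" + code)
        self.stride = pos  # bytes per whole record

    def offset(self, index, name):
        return index * self.stride + self.offsets[name]

class SoA:
    """Each field stored contiguously: x0 x1 ... y0 y1 ... id0 id1 ..."""
    def __init__(self, fields, count):
        self.bases, self.sizes, pos = {}, {}, 0
        for name, code in fields:
            size = struct.calcsize("<" + code)
            self.bases[name], self.sizes[name] = pos, size
            pos += size * count

    def offset(self, index, name):
        return self.bases[name] + index * self.sizes[name]

class Table:
    """Same get/set API regardless of which layout is plugged in."""
    def __init__(self, fields, count, layout_cls):
        self.codes = dict(fields)
        self.count = count
        self.layout = layout_cls(fields, count)
        record_size = sum(struct.calcsize("<" + code) for _, code in fields)
        self.buf = bytearray(record_size * count)

    def set(self, index, name, value):
        struct.pack_into("<" + self.codes[name], self.buf,
                         self.layout.offset(index, name), value)

    def get(self, index, name):
        return struct.unpack_from("<" + self.codes[name], self.buf,
                                  self.layout.offset(index, name))[0]

def fill_and_sum_ids(table):
    """User code: knows nothing about how the table is laid out."""
    for i in range(table.count):
        table.set(i, "x", float(i))
        table.set(i, "y", float(i) * 2)
        table.set(i, "id", i + 100)
    return sum(table.get(i, "id") for i in range(table.count))

aos_table = Table(FIELDS, 3, AoS)
soa_table = Table(FIELDS, 3, SoA)
aos_sum = fill_and_sum_ids(aos_table)
soa_sum = fill_and_sum_ids(soa_table)
```

`fill_and_sum_ids` gives the same answer either way, while the bytes in `buf` are arranged differently. A tagged union with a custom bit layout, or a sequential-only view over compressed data, would in principle be just another layout behind the same interface.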
Again, to be clear, these are crazy, pie-in-the-sky, far-out ideas that I do not have an implementation in mind for, but it's the sort of thing I'd like to see more experimentation with on the fringes of language dev. (And I only wish I had time to do it myself. Unfortunately, I simply do not.)
(And, as the sibling comments point out, yeah, assembler technically, but that's kind of a cop-out.)