I don't think there's anything out there in the that matches the volume of deployed HTML, CSS and JS in the wild.
The horribly sad part is that HTML, CSS and JS are a gigantic Rube Goldberg implementation of "run arbitrary code in a safe sandbox," because the Web is also the world's biggest collection of legacy dependency.
IMHO, the source of the engineering cringe making everything so much sadder and less than what it could be is that the W3C/WHATWG/IETF/etc are made up of consortiums of large, foghorn-equipped corporations - corporations that have vested interests in advertising, consumer retention, and strong guarantees of indefinite consumption.
I've never really gotten the reasoning behind the technical directions the Web's gone in; a lot of things have stuck and worked, but so many more have flopped, yet the associated implementations for both the successes and failures have to be maintained going forward indefinitely.
The iterative pace on the various Web standards is another problem - things go so fast that the implementations can never get really really good, and Chrome uses literally all of your memory (whether you have 2GB or 20GB, apparently!) as a result.
---
Regarding $terminal_emulator being faster than VGA, I can emphatically state that virtually all of them are disasterously slow. aterm had some handcoded SSE circa 2001 to support fake window shadowing (fastcopy the portion of the root window image underneath the terminal window whenever the window is moved; use SSE to darken the snagged area; apply as terminal window background) but besides that sort of thing, terminal emulators have more or less never been bastions of speed.
If by VGA you mean true textmode (the 720x400 kind, generated entirely by the video card), I don't think there's much that's faster than that. Throw setfont and the Cyr_a8x8 font in there to get 80x50 (I think it is, or 80x43) and you have something hard to beat, since spamming ASCII characters at the video card's memory will always be faster than addressing pixels in a framebuffer.
Which is why GPU-accelerated terminal emulators are so interesting: they're eliminating as many software/architectural bottlenecks as possible to make those expensive framebuffer updates as quick as possible. It's definitely the way to go; games are generally rated on their ability to push GPUs to >60fps at 1080p (and increasingly 2K/4K/8K), so the capacity is really there.
The i3 window manager could be considered one of many comparable similar implementations to tmux. It's not perfect (it's not as configurable as I'd prefer), but it'd get you X and the ability to view media more easily.
I do really appreciate the tendency to want to view a computer as an industrial terminal appliance though. Task switching is still best done by associating tasks with different objects in physical space, so keeping the computer for terminal work and keeping tablets (et al) for other tasks does make legitimate sense.
---
Regarding data usage, that's a tricky one - most successful Internet companies provide some kind of service that necessarily requires the collection of arguably private information in exchange for a novel convenience. As an example, mapping services don't truly need your realtime location but having that means that they can stream the most relevant tiles of an always-up-to-date map to you. The alternative is storing an entire world map, or subsetted map(s) for the locations you think you'll need, but that'll kill almost all the storage on phones without massive SD cards.
---
I find elimination of unnecessary resource consumption a fun concept to explore, almost to the point of obsession. In this regard I often come back to Forth. I was reading this yesterday - http://yosefk.com/blog/my-history-with-forth-stack-machines.... - and it explores how Forth is essentially the mindset of eliminating ALL but the smallest functional expression of the irreducible complexity of an idea, often to the point of insanity. It's not a register-based language so it's never going to beat machine code for any modern processor, but it's a very very interesting concept to seriously explore, at least. (And I say that as someone interested in actually using Forth for something practical, as described in that article.)
---
AFAIK, the RPi actually boots off the GPU, or at least the older credit-card-sized ones did. I'm not sure about the current versions.
ATI released some documentation about their designs a while back with the subtext of enabling open-source driver development. I don't think that panned out as much as was hoped.
My understanding is that Intel has both NVIDIA and AMD beat nowadays when it comes to Linux graphics support; the two former vendors still heavily rely on proprietary drivers (on Linux) for a lot of functionality.
Sadly, since they both have to successfully compete in the market, they're unlikely to release their hardware designs in significant detail anytime soon. (Even if they did like the idea of merging, single gigantic monopolies have a lot of risk, and the behemoth that resulted would be impossible for Intel to compete with, likely.)
So, learning OpenCL and CUDA (depending on the GPU you have) is likely your best bet. There are extant established ecosystems of resources and domain knowledge for both implementations, and the relevant code is not too tragically licensed AFAIK.