Ironically, chatting with Gemini helped me realize that telecoms have often solved this problem of unlimited usage with rate/speed limiting. If one goes over 10GB or so of data, the speed drops from 5G to 3G, but the data remains unlimited.
I wonder if there could be something like that here, maybe even progressive rate limiting, where after a certain number of tokens (or some other usage metric) the speed slows down a LOT.
Not saying that I would love that as a consumer, as I'd prefer the all-you-can-eat, unlimited plan, but I wonder if it would be a compromise that could work, as it seems to have worked OK in the telecom space.
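For what it's worth, here is a rough Python sketch of what I mean by progressive rate limiting. Everything in it is made up for illustration: the tier thresholds, the speeds, and the function names are my own assumptions, not how any provider actually does it.

```python
# Hypothetical tiers: (tokens used so far, speed in tokens/sec from that point on).
# All numbers are invented for illustration only.
TIERS = [
    (0, 100.0),         # full speed
    (1_000_000, 25.0),  # first slowdown
    (5_000_000, 5.0),   # slows down a LOT, but usage is never cut off
]

def allowed_rate(tokens_used: int, tiers=TIERS) -> float:
    """Return the speed (tokens/sec) for a user with this much usage."""
    rate = tiers[0][1]
    for threshold, tier_rate in tiers:
        # Tiers are sorted by threshold, so the last match wins.
        if tokens_used >= threshold:
            rate = tier_rate
    return rate

def seconds_for(request_tokens: int, tokens_used: int) -> float:
    """Estimate how long a response of request_tokens takes at the current tier."""
    return request_tokens / allowed_rate(tokens_used)
```

So a heavy user never hits a hard cap; the same 500-token response just takes longer the further past the thresholds they are.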
edit: the nerd in me loves the irony of me making the above comment and then later seeing your username as flux :-)