> in fact, they actually compound the problem by encouraging significantly more usage
because if, with training costs set aside, the model is sold above the cost of running it, then significantly more usage helps the problem rather than compounding it.
More usage compounds the problem only if inference is unprofitable.
(The article does briefly mention training, but only later.)
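
To make the arithmetic concrete, here is a rough sketch with made-up numbers. The function name, the prices, and the costs are all hypothetical, chosen only for illustration; nothing here comes from the article or from any real provider's figures. It treats training as a fixed cost and inference as a per-unit cost.

```python
def net_result(units, price_per_unit, inference_cost_per_unit, training_cost=0):
    """Net profit (negative means loss) for a given amount of usage.

    All inputs are hypothetical illustration values, not real data.
    Training is modeled as a fixed cost; inference as a per-unit cost.
    """
    margin = price_per_unit - inference_cost_per_unit
    return units * margin - training_cost


# Case 1: inference is priced above cost (margin = +1 per unit).
# Doubling usage moves the result from 0 to +1000, so usage helps.
profitable_low = net_result(1000, price_per_unit=3, inference_cost_per_unit=2, training_cost=1000)
profitable_high = net_result(2000, price_per_unit=3, inference_cost_per_unit=2, training_cost=1000)

# Case 2: inference is priced below cost (margin = -1 per unit).
# Doubling usage moves the result from -2000 to -3000, so usage hurts.
unprofitable_low = net_result(1000, price_per_unit=2, inference_cost_per_unit=3, training_cost=1000)
unprofitable_high = net_result(2000, price_per_unit=2, inference_cost_per_unit=3, training_cost=1000)
```

The sign of the per-unit margin decides which way extra usage pushes the result; the fixed training cost only shifts the starting point.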