You may need to "right-size" the models you use to match your hardware, model, and TPS expectations, which may involve using a smaller version of the model with faster TPS, upgrading your jardware, or paying for hosted models.
Alternatively, if you can use agentic workflows or tools like Aider, you don't have to watch the model work slowly with large modles locally. Instead you queue work for it, go to sleep, or eat, or do other work, and then much later look over the Pull Requests whenever it completes them.