I posted this yesterday: https://github.com/day50-dev/petsitter
I use it with https://github.com/day50-dev/simple-llm-cli
Then I modify the "tricks" until my evals reach good numbers. It has to be done on a model-by-model basis.
This is what the larger firms are doing: they have custom prompts per model.
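
To make the loop concrete, here is a rough sketch of the per-model tuning idea. It is not petsitter's or simple-llm-cli's actual API: the `TRICKS` table, the model names, `build_prompt`, `run_eval`, and the `fake_model` stand-in are all hypothetical, and the scoring is plain exact match.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical per-model "tricks": extra instructions prepended to each prompt.
# Model names and trick text are made up for illustration.
TRICKS: Dict[str, List[str]] = {
    "model-a": ["Answer with a single word."],
    "model-b": ["Answer with a single word.", "Do not explain your answer."],
}

# Tiny eval set of (question, expected answer) pairs.
CASES: List[Tuple[str, str]] = [("What is 2 + 2?", "4")]


def build_prompt(model: str, question: str) -> str:
    """Prepend the tricks registered for this model (if any) to the question."""
    return "\n".join(TRICKS.get(model, []) + [question])


def run_eval(
    model: str,
    call_model: Callable[[str, str], str],
    cases: List[Tuple[str, str]],
) -> float:
    """Return the fraction of cases where the model's reply matches exactly."""
    if not cases:
        return 0.0
    hits = sum(
        call_model(model, build_prompt(model, question)).strip() == expected
        for question, expected in cases
    )
    return hits / len(cases)


def fake_model(model: str, prompt: str) -> str:
    """Stand-in for a real LLM call: only terse when told to be."""
    return "4" if "single word" in prompt else "The answer is 4"


if __name__ == "__main__":
    # Compare a tuned model against one with no tricks registered.
    for name in ("model-a", "model-c"):
        print(name, run_eval(name, fake_model, CASES))
```

In practice you would swap `fake_model` for a real call, edit one model's entry in `TRICKS`, rerun, and keep the change only if that model's score improves.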