Small changes to a prompt can produce a lot of unpredictable behavior. That's even more concerning as we build larger apps on something like LangChain, which requires the output to follow a rigid format.
My instinct would be to run a unit test suite on every prompt change. Is there an existing framework for that? If not, how are you testing your changes?
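
To make the idea concrete, here's a rough sketch of the kind of check I have in mind. It is not based on any existing framework: `PROMPT_V2`, `fake_llm`, `check_output`, and `run_suite` are all made-up names, and `fake_llm` is just a stand-in that returns a canned reply where a real model call would go.

```python
import json

# Hypothetical prompt template under test (illustrative only).
PROMPT_V2 = "Extract the city and country from: {text}. Reply with JSON only."


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption); returns a canned reply."""
    return '{"city": "Paris", "country": "France"}'


def check_output(raw: str, required_keys: set) -> list:
    """Return a list of format violations; an empty list means the output passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    missing = required_keys - set(data)
    return [f"missing key: {key}" for key in sorted(missing)]


def run_suite(llm, template: str, cases: list, required_keys: set) -> dict:
    """Run every input through the model and collect violations per input."""
    failures = {}
    for text in cases:
        problems = check_output(llm(template.format(text=text)), required_keys)
        if problems:
            failures[text] = problems
    return failures


if __name__ == "__main__":
    failures = run_suite(fake_llm, PROMPT_V2, ["I live in Paris"], {"city", "country"})
    print("failures:", failures)
```

This only checks the shape of the output, not whether the content is right, and with a real model you'd probably need to run each case several times since the replies aren't deterministic.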