That makes it hard to evaluate the genAI parts of the application, and also iterating on the prompts is not as straightforward as opening up a playground.
Having the config be the source of truth let's you connect it to your application code (and still source controlled), lets you evaluate the config as the AI artifact, and also lets you open the config in a playground to edit and iterate.
For example, compare how much simpler openai function calling becomes with storing the stuff as a config: https://github.com/lastmile-ai/aiconfig/blob/main/cookbooks/... vs using vanilla openai directly (https://github.com/openai/openai-node/blob/v4/examples/funct...)
However, the prompt is your business logic in most cases and put your business logic into a separate file make it harder to read and harder to maintain.
Our basic premise is that AI application development should be config-based, so you can track the prompts, models and model parameters being used more rigorously. Having this AI artifact then lets you iterate on it separately from your application code, and also set up evals that provide "test coverage" for the gen AI parts of your application.
We were also inspired by the ipynb format for Jupyter notebooks, and you'll see parallels to that in the aiconfig format.
Please ask any questions, and share your thoughts on config vs. code.
A related question - I want to have chains of function calls - so that the prompts should contain all previous function calls and function call results. The chains can have variable length and end when the LLM calls a 'finish' function. How can I do that with AIConfig?
In particular for 1. teams that have complex slow deploys, but want to change prompt now 2. when there are data analyst types doing the prompts and people don't want them to be able to "break things". 3. being able to alpha test / rollout / target new prompts easily.
Definitely an interesting question whether prompts is code or configuration.