The subtle bit is it just doesn't have to be for LLMs, as these are typically part of a system-of-models. E.g., we <3 RAG, and using GNNs to improve your KG is fascinating. Likewise, dspy's explorations in optimizing prompts, vs LLMs, are very cool.
Oh man I am so torn between this being a fantastic idea and this being "building a better slide-rule in the age of the computer".
dspy is definitely a project I want to dig into more
Once RAG projects become important and good answers matter - we work with governments, manufacturers, banks, cyber teams, etc - working through data quality, data representation, & retrieval quality helps
Note that we didn't start here: we began with naive RAG, then relevancy filtering, then agentic & neurosymbolic querying, then dynamic example prompt injection, and now are getting into cleaning up the database/KG itself.
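For anyone curious what the first step in that progression (naive RAG to relevancy filtering) can look like, here is a toy sketch. It is not our implementation: the bag-of-words cosine score stands in for a real embedding model, and `retrieve`, `min_score`, and the sample documents are all made up for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def bow(text: str) -> Counter:
    """Lowercased bag-of-words; a stand-in for a real embedding (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(
    query: str, docs: List[str], k: int = 3, min_score: float = 0.0
) -> List[Tuple[float, str]]:
    """Top-k retrieval; min_score > 0 additionally drops weak matches."""
    q = bow(query)
    ranked = sorted(((cosine(q, bow(d)), d) for d in docs), reverse=True)
    return [(s, d) for s, d in ranked[:k] if s >= min_score]


# Hypothetical sample corpus.
DOCS = [
    "Invoice fraud patterns in bank wire transfers",
    "Wire transfer fraud alerts for bank analysts",
    "Cafeteria lunch menu for Friday",
]

# Naive RAG: always pass the top-k into the prompt, relevant or not.
naive = retrieve("bank wire fraud", DOCS, k=3)

# Relevancy filtering: same ranking, but weak matches never reach the prompt.
filtered = retrieve("bank wire fraud", DOCS, k=3, min_score=0.2)
```

In this toy corpus the naive call still hands the lunch menu to the LLM, while the filtered call keeps only the two fraud documents. The threshold value is arbitrary here and would need tuning against real data.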
For folks doing investigative/analytics projects in this space, happy to chat about what we are doing w Louie.AI. These are more implementation details we don't normally write about.
A year+ later, the most interesting kernel of insight to us from dspy is autotuning a single prompt: it's an optimizable model just like any other. As soon as you have an eval framework in place for your prompts, having something like dspy tune your prompts on a per-LLM basis would be very cool. I'm not sure where they are on that; it seems against the grain for their focus. We're only now reaching the point where we would see ROI on that kind of thing, and it took a long time to get here.
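To make "a prompt is an optimizable model" concrete, here is a minimal sketch of per-LLM prompt selection against an eval set. It does not use dspy's API; the `LLM` type, `score`, `autotune`, and the two toy models are hypothetical stand-ins, and a real tuner would search or generate candidate prompts rather than pick from a fixed list.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: an "LLM" is just a function from prompt text to answer text.
LLM = Callable[[str], str]
EvalSet = List[Tuple[str, str]]  # (question, expected answer)


def score(llm: LLM, template: str, evals: EvalSet) -> float:
    """Fraction of eval questions answered exactly right with this template."""
    hits = sum(
        llm(template.format(question=q)).strip() == expected
        for q, expected in evals
    )
    return hits / len(evals)


def autotune(
    llms: Dict[str, LLM], templates: List[str], evals: EvalSet
) -> Dict[str, str]:
    """Pick the best-scoring template separately for each LLM."""
    return {
        name: max(templates, key=lambda t: score(llm, t, evals))
        for name, llm in llms.items()
    }


# Toy stand-ins for two models that respond differently to the same prompt.
def model_a(prompt: str) -> str:
    return "4" if "only the number" in prompt else "It is 4."


def model_b(prompt: str) -> str:
    return "4" if prompt.startswith("Q:") else "Sure! 4"


TEMPLATES = [
    "Q: {question}\nA:",
    "Answer with only the number.\nQ: {question}\nA:",
]
EVALS = [("What is 2+2?", "4")]

best = autotune({"model_a": model_a, "model_b": model_b}, TEMPLATES, EVALS)
```

Here `best` maps `model_a` to the "only the number" template and `model_b` to the bare one, which is the per-LLM point: the same eval set picks different prompts for different models.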
We do run an agentic framework, so doing cross-prompt autotuning would be neat too -- especially for how the orchestrator (e.g., CoT) composes with individual agents. We call this the "composition problem" and it's frustrating. However, again, dspy and friends do "too much" by trying to also be the agent framework & runtime, while we just want the autotuner.
Identifying misinfo - Ranking & summarization based on internet data should be a lot more careful, and sometimes the controversy is the interesting part
For both, GNNs are generally SOTA