I remember that John McCarthy used to condemn AI research as a series of "look mom, no hands" demos, but now we've gotten to "look mom, I face-planted". This work would be valid if it did things like:
- Describe an intelligent effect rather than just "invoked an API": for example, the system evaluating the portfolio and the market, deciding what should be sold, or surfacing buying opportunities based on some inference (this is not a great example, for many reasons).
- Measure and report the system's performance. The write-up says that the LLM fails, ok... how often?
- Describe the failure cases, and provide some theory as to why some things succeeded and others didn't.
- Provide a way forward. What's next?
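The "how often?" question above doesn't even require sophisticated tooling. A minimal sketch of the kind of measurement I mean, where `run_task` is a hypothetical stand-in for one invocation of the agent under test:

```python
import random

def run_task(prompt: str) -> bool:
    """Hypothetical stand-in for one agent run; returns True on success.

    A real evaluation would call the LLM/agent under test and check
    its output against an expected result for this prompt.
    """
    return random.random() < 0.7  # placeholder success probability

def measure_failure_rate(prompts, trials_per_prompt=10):
    """Run each prompt several times and report the overall failure rate."""
    failures = 0
    total = 0
    for prompt in prompts:
        for _ in range(trials_per_prompt):
            total += 1
            if not run_task(prompt):
                failures += 1
    return failures / total

if __name__ == "__main__":
    rate = measure_failure_rate(["sell AAPL", "rebalance portfolio"])
    print(f"failure rate: {rate:.1%} over sampled runs")
```

Even a number this crude ("fails N% of the time on these tasks") would move the write-up from anecdote to evidence.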
Without doing these things, this work isn't helpful and is just part of the AI/Agent/MCP hype. Basically, it's Bored Apes for AI.