What it does
- Connects any LLM to existing ROS robots via the Model Context Protocol (MCP)
- Natural language → ROS topics, services, and actions (and the ability to read any of them back)
- Works without changing robot source code

Why it matters
- Makes robots accessible from natural language interfaces
- Opens the door to rapid prototyping of AI-robot applications
- We are trying to create a common interface for safe AI ↔ robot communication
This is too big to develop alone — we’d love feedback, contributors, and partners from both the robotics and AI communities.
The underlying stack can connect to multiple robots simultaneously. Our current implementation is sequential: the language model connects to one robot, then moves on to the next (and can switch back).
It would certainly be possible for us to make it simultaneous as well.
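To illustrate the difference between the sequential and simultaneous styles, here is a minimal Python sketch using asyncio. It is not the project's code: `query_robot` is a hypothetical stand-in that only simulates a network round trip, and the robot names are made up.

```python
import asyncio
import time

# Hypothetical stand-in for one request/response round trip to a robot.
# A real client would talk to the robot's bridge here instead of sleeping.
async def query_robot(name: str, delay: float = 0.05) -> str:
    await asyncio.sleep(delay)  # simulated network latency
    return f"{name}: ok"

async def query_sequential(names):
    # One robot at a time, as in the sequential style described above.
    return [await query_robot(n) for n in names]

async def query_concurrent(names):
    # All robots at once; gather preserves the input order of results.
    return list(await asyncio.gather(*(query_robot(n) for n in names)))

def timed(coro_fn, names):
    # Run one of the two strategies and report (results, elapsed seconds).
    start = time.perf_counter()
    results = asyncio.run(coro_fn(names))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    robots = ["arm", "dog", "rover"]  # assumed names, for illustration only
    for fn in (query_sequential, query_concurrent):
        results, elapsed = timed(fn, robots)
        print(fn.__name__, results, f"{elapsed:.2f}s")
```

With simulated latency, the concurrent version takes roughly as long as the slowest robot, while the sequential one takes the sum of all round trips.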
On the flip side, how would you handle conflicting commands from multiple clients? Is it last-writer-wins, or do you envision some arbitration layer? It feels like orchestration + conflict resolution will be key if MCP is to scale beyond single-robot demos into fleet-level use.
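To make the two options concrete, here is a rough Python sketch contrasting last-writer-wins with a simple arbitration layer. Everything in it is an assumption for illustration (the `Command` fields, the priority numbers, the lease timeout); it is not how the project currently behaves.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Command:
    client: str      # who sent it (hypothetical client id)
    priority: int    # higher wins; the scale is an assumption
    payload: str     # what the robot should do
    stamp: float     # send time in seconds

class LastWriterWins:
    """Every new command replaces the previous one, no questions asked."""
    def __init__(self) -> None:
        self.active: Optional[Command] = None

    def submit(self, cmd: Command) -> bool:
        self.active = cmd
        return True

class PriorityArbiter:
    """Accept a command only if the current holder's lease has expired,
    the sender already holds control, or the sender has higher priority."""
    def __init__(self, lease_s: float = 1.0) -> None:
        self.lease_s = lease_s
        self.active: Optional[Command] = None

    def submit(self, cmd: Command) -> bool:
        cur = self.active
        expired = cur is None or cmd.stamp - cur.stamp > self.lease_s
        if expired or cmd.client == cur.client or cmd.priority > cur.priority:
            self.active = cmd
            return True
        return False  # rejected: someone more important holds control

if __name__ == "__main__":
    arbiter = PriorityArbiter(lease_s=1.0)
    print(arbiter.submit(Command("operator", 10, "stop", 0.0)))
    print(arbiter.submit(Command("llm", 1, "forward", 0.5)))
    print(arbiter.active)
```

In this sketch an operator's stop command cannot be overridden by a lower-priority LLM client until the operator's lease runs out, whereas last-writer-wins would let the later command through immediately.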
What excites me most is the potential for MCP to help with diagnostics and deployment for non-developers. A lot of lab techs or operators don’t want to dive into ros2 topic hz or parse logs — they just want to ask simple questions like “why isn’t the arm responding?” or “is this topic publishing?”.
A natural language layer over ROS could make debugging and deployment way easier for non-technical users — almost like having a conversational ros2 doctor or ros2 launch.
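As a rough idea of what answering “is this topic publishing?” could look like behind the scenes, here is a small Python sketch that turns message arrival times into a plain-language answer. The function names and the staleness threshold are hypothetical, and the timestamps are assumed to have been collected elsewhere (for example by a subscriber); this is not the project's actual tooling.

```python
from typing import Optional, Sequence

def estimate_rate_hz(stamps: Sequence[float]) -> Optional[float]:
    """Average publish rate from arrival times in seconds, oldest first."""
    if len(stamps) < 2:
        return None
    span = stamps[-1] - stamps[0]
    if span <= 0:
        return None
    return (len(stamps) - 1) / span

def describe_topic(topic: str, stamps: Sequence[float], now: float,
                   stale_after_s: float = 2.0) -> str:
    """Turn raw arrival times into a sentence an operator can read."""
    if not stamps:
        return f"{topic} has not published any messages."
    age = now - stamps[-1]
    if age > stale_after_s:  # threshold is an assumption, tune per topic
        return f"{topic} stopped publishing {age:.1f} s ago."
    rate = estimate_rate_hz(stamps)
    if rate is None:
        return f"{topic} has published one message so far."
    return f"{topic} is publishing at about {rate:.1f} Hz."

if __name__ == "__main__":
    arrivals = [0.0, 0.1, 0.2, 0.3]  # made-up arrival times
    print(describe_topic("/joint_states", arrivals, now=0.35))
    print(describe_topic("/joint_states", arrivals, now=5.0))
```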
This isn’t just a bridge between LLMs and robots, it can also be a bridge between non-developer operators and the ROS ecosystem.
For the industrial robot (in the video on the main README page) I intentionally gave Claude no context beforehand. All of the inferences that you see there are based on information that it got through the MCP tools that let it analyze the ROS topics and services on the robot.
In fact, my starting prompt told it to ignore all context from previous conversations, because this looked like an example of emergent behavior and I wanted to confirm that it was not picking things up from my earlier conversations!
For more advanced use cases, we’re also thinking about adding validation layers and safety constraints before execution — so the MCP acts not just as a bridge, but also as a safeguard.
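A validation layer of this kind could be as simple as checking each command against limits before it is forwarded. The Python sketch below is only an illustration under assumed names and limits (`VelocityCommand`, `Limits`, 0.5 m/s, 1.0 rad/s); it is not an implemented feature.

```python
import math
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class VelocityCommand:
    linear: float   # m/s
    angular: float  # rad/s

@dataclass(frozen=True)
class Limits:
    max_linear: float = 0.5   # assumed limit, robot specific in practice
    max_angular: float = 1.0  # assumed limit

class UnsafeCommand(ValueError):
    """Raised when a command fails validation and must not be sent."""

def validate(cmd: VelocityCommand, limits: Limits = Limits()) -> VelocityCommand:
    # Reject NaN/inf first, since comparisons with NaN silently pass.
    if not (math.isfinite(cmd.linear) and math.isfinite(cmd.angular)):
        raise UnsafeCommand("velocity must be finite")
    if abs(cmd.linear) > limits.max_linear:
        raise UnsafeCommand(f"linear {cmd.linear} exceeds {limits.max_linear}")
    if abs(cmd.angular) > limits.max_angular:
        raise UnsafeCommand(f"angular {cmd.angular} exceeds {limits.max_angular}")
    return cmd

def execute(cmd: VelocityCommand, send: Callable[[VelocityCommand], None],
            limits: Limits = Limits()) -> None:
    # The command only reaches the robot if validation passes.
    send(validate(cmd, limits))

if __name__ == "__main__":
    sent = []
    execute(VelocityCommand(0.2, 0.0), sent.append)
    try:
        execute(VelocityCommand(3.0, 0.0), sent.append)
    except UnsafeCommand as err:
        print("blocked:", err)
    print(sent)
```

Rejecting an out-of-range command rather than clamping it is a design choice made for this sketch; clamping would be an equally plausible policy.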
Does this also extend to industrial (sector-agnostic) applications, where mitigating actions can be taken proactively through LLM-directed mitigation protocols, based on leading indicators from vision or other sensor data? Could it also let non-technical users drive debugging or similar mitigation actions?
This is the video of interacting with and debugging an industrial robot. (A few of the other comments here have mentioned that it shows some amount of what looks like emergent behavior.) https://www.youtube.com/watch?v=SrHzC5InJDA
This is a video from a collaborating research lab controlling a Unitree Go (robot dog): https://youtu.be/RW9_FgfxWzs?si=o7tIHs5eChEy9glI