1. Omnigent: Meta-Harness for Coding Agents (Claude Code, Codex, Cursor, Pi) (github.com) | 2 points | fzysingularity | 7d ago | 0 comments
3. Unified Vision-Language Agents – Detect, Segment, OCR, Generate and More (github.com) | 5 points | fzysingularity | 6mo ago | 1 comment
4. VLM Showdown: GPT vs. Gemini vs. Claude vs. Orion (chat.vlm.run) | 15 points | fzysingularity | 7mo ago | 1 comment
5. Show HN: Chat with Orion – a visual agent that sees, reasons and acts (chat.vlm.run) | 22 points | fzysingularity | 7mo ago | 10 comments
6. ChatGPT uses YOLOv8 to detect UI elements (twitter.com) | 1 point | fzysingularity | 7mo ago | 0 comments
7. Build visual AI workflows from a prompt – OCR, detection, editing and more (colab.research.google.com) | 5 points | fzysingularity | 11mo ago | 4 comments
8. How we solved multi-modal tool-calling in MCP agents – VLM Run MCP (docs.vlm.run) | 14 points | fzysingularity | 11mo ago | 6 comments
9. video2json – Transcribe and analyze *hours-long* videos (docs.vlm.run) | 3 points | fzysingularity | 1y ago | 1 comment