Most evaluation tools focus on large language models (LLMs); multimodal models deserve evaluation tools of their own. We believe it should be easy to compare models and to share evaluations publicly.